Pattern Recognition Letters 20 (1999) 541–547
www.elsevier.nl/locate/patrec
Landmark recognition using invariant features

Barbara Zitová *, Jan Flusser 1

Department of Image Processing, Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod vodárenskou věží 4, 182 08 Prague 8, Czech Republic

Received 27 October 1997; received in revised form 9 March 1999
Abstract

This paper deals with view-invariant recognition of circular landmarks used for mobile robot navigation. A recognition model based on the affine moment invariants (AMIs) is introduced. The recognition ability of the AMIs for this particular landmark shape is investigated in the presence of additive random noise and for various viewing angles. Results of experiments in real situations, which demonstrate the discriminability and stability of the recognition model, are shown. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Landmark recognition; Affine moment invariants; Robot navigation
* Corresponding author. Tel.: +420 2 6605 2357; fax: +420 2 688 4903; e-mail: [email protected]
1 E-mail: [email protected]

0167-8655/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0167-8655(99)00031-8

1. Introduction

Recent technical developments bring new possibilities for the employment of robots. Some tasks require mobile robots to act autonomously: they must have a correct notion of their position and of what their next action should be. Moravec's Cart (Moravec, 1981) represents one of the first solutions to the robot navigation problem. The current robot position can be estimated using its trajectory information and the robot's previous representation of the world. The robot constructs its own map based on the data acquired by its sensors (Leonard et al., 1990; Brooks, 1986; Shah and Aggrawal, 1995). After incorporating the robot motion trajectory and the sensor data into the evolving world representation, the robot can make a decision about its position and about its future actions. This approach is appropriate when the robot's working space is not static or is complicated.

In some tasks we can assume that the map of the robot's work environment is known a priori (Cox and Wilfong, 1990). An appropriate representation of the robot's working space is stored in its memory and used as the reference model for decisions. Using its sensors, the robot collects new information about the surrounding world and, by matching the new data with the reference model, estimates its position (Kosaka and Kak, 1993; Dulimarta and Jain, 1997).

The robot position can also be estimated by means of landmarks. We use the term landmark for an object in the scene that the robot finds distinctive. Landmarks can be used to impart certain types of information to the robot (Lewitt et al., 1987). For example, recovery from a failure during robot navigation can be based on finding landmarks (Kosaka and Kak, 1993). The
landmarks can be either part of the world (Yeh and Kriegman, 1995) or artificial signs placed in the robot's environment (Kortenkamp et al., 1993). They are detected by visual sensors and then compared against a database of landmarks. In this approach no representation of the surrounding world is necessary, so complex and often memory-demanding descriptions can be avoided.

In this paper, we do not attempt to compare various robot navigation strategies or to propose the "best" landmark shape. The aim of the paper is to find a set of shape descriptors that are stable and discriminative enough for a given class of landmarks of a particular shape. We propose to apply a model based on the affine moment invariants (AMIs) (Flusser and Suk, 1993, 1994). To test the applicability of this idea, we investigated the stability and robustness of the AMIs when additive random noise or various viewing angles are present during the acquisition process.

Several aspects of the use of navigation marks are mentioned in Section 2. The database of marks with the shapes required by the project description is described in Section 3. The recognition model based on the AMIs is introduced in Section 4. Section 5 deals with the experiments we made to test the recognition ability of the proposed model in the presence of additive random noise and various viewing angles. Section 6 summarizes the proposed approach and the experiments.

2. Landmarks

As mentioned above, one approach to navigating a mobile robot is to use landmarks. They can be part of the environment, for example sets of vertical edges of objects located in the robot's surroundings. However, it is often much easier to use artificial marks placed in the environment. They are considered a priori knowledge of the world. The particular shape of a mark is bound to the information that should be imparted to the robot (about the robot's position, a crossing, an obstacle, a special task). Distinctively shaped marks should then be situated in appropriate places.

If artificial marks are used, we are not limited by the natural layout of objects. The use of natural marks can be inconvenient in some tasks because their description and detection may be complex. Artificial marks can be used in many situations without problems, and we can even assume that they are not occluded by an obstacle (we can choose their positions). In this article we confine ourselves to robot navigation with artificial, non-occluded marks.

To obtain good results using the landmark approach, marks should fulfill several conditions. They should be distinctive, meaning that they should be discriminable from one another, and finding them should not be too difficult.

The way the imparted information is used is an important aspect of artificial landmarks. Marks can be used directly for robot self-localization in a world map in which the position of each landmark is recorded. Alternatively, the information can be encoded directly into the shape; for example, the numbers of vertical and horizontal lines on the mark can determine the x and y coordinates of the robot position, respectively. With this approach the mark shapes must be designed correctly for the given task, and the marks should offer enough ways to vary their shape. It is often more suitable to use the information encoded in the mark shapes as a key to look-up tables. Then only the data in the look-up tables have to be changed when the robot's task changes.

3. The mark set

Our given mark set consists of patterns formed by two concentric circles with equal outer and different inner radii (see Fig. 1). This mark set was part of the given assignment; our aim was not to discuss its suitability as a landmark set or to optimize the mark shape. The information to be imparted to the robot can be encoded into the ratio of the mark's inner and outer radii. This way of encoding ensures that the acquired information is independent of the robot's position with respect to the mark. Marks designed in this way have one free parameter for encoding data.
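The look-up-table idea from Section 2, applied to this mark set, might be sketched as follows. This is only an illustration under stated assumptions: the table contents, the names `MARK_TABLE`, `NEW_TASK_TABLE` and `decode_mark`, and the three ratios used as keys are hypothetical (the ratios are those of the reference marks used later in the experiments), and the recognised ratio is assumed to be delivered by whatever recognition model is in use.

```python
from fractions import Fraction

# Hypothetical look-up table: radii ratio (inner / outer) -> information for the robot.
# The entries are placeholders modelled on the kinds of information named in Section 2.
MARK_TABLE = {
    Fraction(2, 5): "crossing",
    Fraction(1, 2): "obstacle",
    Fraction(3, 5): "special task",
}


def decode_mark(recognised_ratio, table=MARK_TABLE):
    """Return the table entry whose key is closest to the recognised radii ratio."""
    key = min(table, key=lambda ratio: abs(float(ratio) - recognised_ratio))
    return table[key]


# Changing the robot's task only requires a new table; the marks stay the same.
NEW_TASK_TABLE = {
    ratio: f"waypoint {i}" for i, ratio in enumerate(sorted(MARK_TABLE))
}

print(decode_mark(0.52))                  # obstacle
print(decode_mark(0.52, NEW_TASK_TABLE))  # waypoint 1
```

Keeping the ratio as a key rather than as the message itself separates the physical marks from the task-specific data, which is the design choice the text argues for.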
Fig. 1. Examples of navigation landmarks with different information encoded.

As mentioned above, a mark has to be easy to find and different marks should be sufficiently discriminable. The given mark shape allows easy localization of the marks in a complex environment. The discriminability depends not only on the marks themselves but, essentially, also on the recognition model used.

4. Recognition model – affine moment invariants

We chose an invariant-based model for mark recognition. Invariants are descriptions of those features of an object that stay unchanged when the object is distorted. Invariants differ in the set of deformations they are invariant to. Several kinds of features have been used for object description in recent works, such as shape vectors (Peli, 1981), shape matrices (Goshtasby, 1985), Fourier descriptors (Zahn and Roskies, 1972), differential invariants (Weiss, 1988) and moment invariants (Hu, 1962; Belkasim et al., 1991; Prokop and Reeves, 1992). Most of them can describe binary objects only and, moreover, are invariant only under translation, rotation and scaling of the object. Recently, Flusser and Suk (1993) derived the AMIs, which are invariant under general affine transformations:

$$u = a_0 + a_1 x + a_2 y, \qquad v = b_0 + b_1 x + b_2 y, \tag{1}$$

where $(x, y)$ and $(u, v)$ are the coordinates in the image plane before and after the transformation, respectively. The invariance of the AMIs has been proven theoretically and experimentally in our recent papers (see Flusser and Suk, 1993, 1994). The AMIs are able to correctly recognize even objects distorted by the perspective projection

$$u = \frac{a_0 + a_1 x + a_2 y}{1 + c_1 x + c_2 y}, \qquad v = \frac{b_0 + b_1 x + b_2 y}{1 + c_1 x + c_2 y}. \tag{2}$$
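A perspective transformation exactly describes the projection of a 3-D object in a general position onto the 2-D image plane when captured by a pinhole camera. When the distance between the camera and the object is significantly larger than the size of the object, the projective transform can be well approximated by the affine transform and the AMIs can provide a correct classification of the objects. This stability of the AMIs was the reason for our decision to use them. They are also robust enough in the presence of additive zero-mean random noise. These properties of the AMIs are demonstrated in Section 5, which describes the experiments that we made with the given mark database and the proposed recognition model. The first six AMIs follow in explicit form:

$$I_1 = \frac{1}{\mu_{00}^{4}} \left( \mu_{20}\mu_{02} - \mu_{11}^{2} \right),$$

$$I_2 = \frac{1}{\mu_{00}^{10}} \left( \mu_{30}^{2}\mu_{03}^{2} - 6\mu_{30}\mu_{21}\mu_{12}\mu_{03} + 4\mu_{30}\mu_{12}^{3} + 4\mu_{03}\mu_{21}^{3} - 3\mu_{21}^{2}\mu_{12}^{2} \right),$$

$$I_3 = \frac{1}{\mu_{00}^{7}} \left( \mu_{20}(\mu_{21}\mu_{03} - \mu_{12}^{2}) - \mu_{11}(\mu_{30}\mu_{03} - \mu_{21}\mu_{12}) + \mu_{02}(\mu_{30}\mu_{12} - \mu_{21}^{2}) \right),$$

$$\begin{aligned} I_4 = \frac{1}{\mu_{00}^{11}} \big( & \mu_{20}^{3}\mu_{03}^{2} - 6\mu_{20}^{2}\mu_{11}\mu_{12}\mu_{03} - 6\mu_{20}^{2}\mu_{02}\mu_{21}\mu_{03} + 9\mu_{20}^{2}\mu_{02}\mu_{12}^{2} \\ & + 12\mu_{20}\mu_{11}^{2}\mu_{21}\mu_{03} + 6\mu_{20}\mu_{11}\mu_{02}\mu_{30}\mu_{03} - 18\mu_{20}\mu_{11}\mu_{02}\mu_{21}\mu_{12} \\ & - 8\mu_{11}^{3}\mu_{30}\mu_{03} - 6\mu_{20}\mu_{02}^{2}\mu_{30}\mu_{12} + 9\mu_{20}\mu_{02}^{2}\mu_{21}^{2} \\ & + 12\mu_{11}^{2}\mu_{02}\mu_{30}\mu_{12} - 6\mu_{11}\mu_{02}^{2}\mu_{30}\mu_{21} + \mu_{02}^{3}\mu_{30}^{2} \big), \end{aligned}$$

$$I_5 = \frac{1}{\mu_{00}^{6}} \left( \mu_{40}\mu_{04} - 4\mu_{31}\mu_{13} + 3\mu_{22}^{2} \right),$$

$$I_6 = \frac{1}{\mu_{00}^{9}} \left( \mu_{40}\mu_{04}\mu_{22} + 2\mu_{31}\mu_{22}\mu_{13} - \mu_{40}\mu_{13}^{2} - \mu_{04}\mu_{31}^{2} - \mu_{22}^{3} \right),$$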
where $\mu_{pq}$ is the central moment of order $(p + q)$. For a 2-D object $A$ it is defined as

$$\mu_{pq} = \iint_A (x - x_t)^{p} (y - y_t)^{q} f(x, y)\, dx\, dy, \tag{3}$$

where $(x_t, y_t)$ are the coordinates of the center of gravity of object $A$ and $f(x, y)$ describes the intensity distribution within $A$. The full derivation of the AMIs can be found in (Flusser and Suk, 1993).

Some general remarks about the AMIs should be made here. First, moments of higher orders describe more subtle variations in shape and are more sensitive to noise. Second, the AMIs $I_2$, $I_3$ and $I_4$ given above are theoretically zero for radially symmetric objects. This is because $\mu_{pq}$ is zero for such objects whenever $p$ or $q$ is odd, and every term of $I_2$, $I_3$ and $I_4$ contains a third-order moment. Thus, $I_2$, $I_3$ and $I_4$ are not suitable for the recognition of the given landmarks.

Generally, the moments and the AMIs are defined for gray-level images with a finite support, without any other restrictions. In our case, however, the images of the landmarks are binarized before the invariants are calculated in order to eliminate the influence of different lighting conditions. Thus, $f(x, y)$ in definition (3) represents the characteristic function of the landmark picture.

We propose a recognition model which, for a given image of a mark acquired by the robot sensors, computes the AMIs $I_1$, $I_5$ and $I_6$ and finds the closest mark in the database by the minimum-distance rule. The ability of this recognition model to make correct decisions for a non-zero viewing angle (projective transform) or in the presence of additive random zero-mean noise in the acquisition process is investigated experimentally in Section 5.
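The recognition model just described can be sketched in a few lines of Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper `ring_mask`, the image size, the synthetic reference marks, and in particular the per-feature factors in `SCALE` are hypothetical. The source does not say how the three invariants are scaled before distances are computed; the factors here were chosen only so that the features have comparable magnitude.

```python
import numpy as np


def ring_mask(ratio, outer=100.0, sx=1.0, sy=1.0, size=512):
    """Binary annulus with inner/outer radii ratio `ratio` (hypothetical helper).

    sx and sy stretch the two axes, i.e. they apply a simple affine distortion.
    """
    y, x = np.mgrid[:size, :size].astype(float)
    c = (size - 1) / 2.0
    r = np.hypot((x - c) / sx, (y - c) / sy)
    return (r <= outer) & (r >= ratio * outer)


def ami_features(mask):
    """Return (I1, I5, I6) of a binary image; definition (3) with f = 1 on the object."""
    ys, xs = np.nonzero(mask)
    dx, dy = xs - xs.mean(), ys - ys.mean()

    def mu(p, q):
        # Discrete central moment: the integral becomes a sum over object pixels.
        return float(np.sum(dx ** p * dy ** q))

    m00 = mu(0, 0)
    i1 = (mu(2, 0) * mu(0, 2) - mu(1, 1) ** 2) / m00 ** 4
    i5 = (mu(4, 0) * mu(0, 4) - 4 * mu(3, 1) * mu(1, 3)
          + 3 * mu(2, 2) ** 2) / m00 ** 6
    i6 = (mu(4, 0) * mu(0, 4) * mu(2, 2) + 2 * mu(3, 1) * mu(2, 2) * mu(1, 3)
          - mu(4, 0) * mu(1, 3) ** 2 - mu(0, 4) * mu(3, 1) ** 2
          - mu(2, 2) ** 3) / m00 ** 9
    return np.array([i1, i5, i6])


# Assumption: scaling factors that bring I1, I5 and I6 to comparable magnitude.
SCALE = np.array([1e4, 1e5, 1e8])

# Reference database built from undistorted synthetic marks (an assumption;
# a real system would use whatever reference images are available).
DATABASE = {
    name: ami_features(ring_mask(ratio))
    for name, ratio in [("2/5", 0.4), ("1/2", 0.5), ("3/5", 0.6)]
}


def classify(features, database=DATABASE, scale=SCALE):
    """Minimum-distance rule in the scaled (I1, I5, I6) feature space."""
    names = list(database)
    dists = [np.linalg.norm((features - database[n]) * scale) for n in names]
    return names[int(np.argmin(dists))]


# A mark with ratio 1/2 seen under an affine stretch of the image axes.
distorted = ami_features(ring_mask(0.5, sx=1.4, sy=0.7))
print(classify(distorted))
```

Because the stretched mark is an affine image of the reference mark, its three invariants should agree with the reference values up to discretization error, so the nearest database entry is expected to be the mark with ratio 1/2.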
5. Numerical experiments

The discriminability of the AMIs was tested on images degraded by additive noise and by projective transform. The experimental setting is shown in Fig. 2. The AMIs were computed from degraded images of the mark whose ratio of inner to outer radius is 1/2.

Fig. 2. Experiment setting: the image of the mark was taken with different camera viewing angles (an example of two camera positions).

The first experiment deals with the recognition of noisy marks. The mark image was corrupted by additive random zero-mean noise with a uniform distribution. The camera viewing angle was fixed at 0°, and the standard deviation (STD) of the added noise ranged over the interval 0–250. The distance of the camera from the mark was 100 pixels, the camera focal length was 70 pixels and the outer mark radius was 100 pixels. The minimum distances between the acquired mark images and the corresponding reference mark from the database were computed in the feature space ($I_1$, $I_5$, $I_6$). The computed values are shown in Fig. 3. The first misclassification occurred at STD 232. The proposed recognition model showed sufficient stability in noisy conditions: even for high values of STD, the distance to the corresponding reference mark is much smaller than the distance to the other reference marks.

Fig. 3. Distance between noisy images and the original image: acquired images differ in the STD value of the added noise. Horizontal axis: standard deviation of the uniform noise of the corresponding image (0–250); vertical axis: the distance (mean value over 20 runs) between noisy images and the original image, measured in the 3-D Euclidean space of the AMIs $I_1$, $I_5$ and $I_6$.

To verify the stability of the AMI recognition model in real situations, the following experiment was carried out. A landmark with an outer radius of 5 cm and an inner radius of 2.5 cm was placed in an indoor environment (see Fig. 4). Several images of this scene were acquired from different viewpoints, at camera-to-mark distances of 1.8 m and 1.0 m. The camera viewing angle grew from 0° to 80° in steps of 5°. The camera deflection introduced projective deformations: the landmark in Fig. 4 appears to have an ellipse-like shape.

Fig. 4. The real indoor scene used in the second experiment. Note the projective deformation of the landmark due to a nonzero viewing angle.

Each image was binarized and the landmark was extracted from the scene. Then each landmark picture was classified by the minimum-distance rule in the 3-D Euclidean space of the invariants $I_1$, $I_5$ and $I_6$ with respect to the reference landmark database. The values of $I_1$, $I_5$ and $I_6$ corresponding to the reference landmarks with radii ratios 2/5, 1/2 and 3/5 are shown in Table 1. Fig. 5 shows the distances between the reference marks and the images acquired at the constant camera distance of 1.0 m from different viewing angles. It can be clearly seen that the radii ratio of the test landmark is always correctly recognized as 1/2 regardless of the
viewing angle. This illustrates the sufficient stability of the AMIs.

Table 1
Values of I1, I5 and I6 corresponding to the reference landmarks with radii ratios 2/5, 1/2 and 3/5, respectively

Radii ratio    I1     I5     I6
2/5            121     61    287
1/2            176    116    768
3/5            287    285   2934

Fig. 5. Distances between acquired images and reference images in ($I_1$, $I_5$, $I_6$) space. Acquired images differ from each other in the viewing angle. Horizontal axis: the viewing angle (0–80°, step 5°); vertical axis: the distance between acquired images and the database landmarks of the radii ratio 1/2 (depicted as ), the radii ratio 2/5 () and the radii ratio 3/5 (+), respectively.

6. Conclusions

In this paper, a method for the recognition of circular navigation landmarks was introduced. The model is based on the AMIs. The performance of the proposed method was demonstrated by experiments on computer-modeled and real data. Although the AMIs are invariant under the affine transform, which is only an approximation of the projective transform occurring in the robot vision system, their recognition ability is high enough in that case too. The stability of the AMIs under random zero-mean noise present in the acquisition of mark images leads to good results: even high values of STD had a rather small influence on the recognition results. The presented experiments demonstrated that the AMIs can be used for the recognition of landmarks of the given particular shape.

Acknowledgements

Part of this work was supported by grant No. 102/96/1694 of the Grant Agency of the Czech Republic. Experiments with the mobile robot were carried out at the Center of Machine Perception, Czech Technical University, Prague.

References
Belkasim, S.O., Shridhar, M., Ahmadi, M., 1991. Pattern recognition with moment invariants: A comparative study and results. Pattern Recognition 24, 1117–1138.
Brooks, R.A., 1986. A robust layered control system for a mobile robot. IEEE Trans. on Robotics and Automation 2 (1), 14–23.
Cox, I.J., Wilfong, G.T., 1990. Autonomous Robot Vehicles. Springer, New York.
Dulimarta, H.S., Jain, A.K., 1997. Mobile robot localization in indoor environment. Pattern Recognition 30 (1), 99–111.
Flusser, J., Suk, T., 1993. Pattern recognition by affine moment invariants. Pattern Recognition 26, 167–174.
Flusser, J., Suk, T., 1994. A moment based approach to registration of images with affine geometric distortion. IEEE Transactions on Geoscience and Remote Sensing 32 (2), 382–387.
Goshtasby, A., 1985. Description and discrimination of planar shapes using shape matrices. IEEE Trans. Pattern Anal. Mach. Intell. 7, 738–743.
Hu, M.K., 1962. Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 8, 179–187.
Kortenkamp et al., 1993. Integrated mobile-robot design. IEEE Expert, August 1993, 61–73.
Kosaka, A., Kak, A.C., 1993. Fast vision-guided mobile robot navigation using model-based reasoning and prediction of uncertainties. CVGIP: Image Understanding 56 (3), 271–329.
Leonard, J.J., Durrant-Whyte, H.F., Cox, I.J., 1990. Dynamic map building for an autonomous mobile robot. IEEE International Workshop on Intelligent Robots and Systems, pp. 89–95.
Lewitt, T., Lawton, D., Cheldberg, D., Nelson, P., 1987. Qualitative navigation. In: Proceedings Image Understanding Workshop, pp. 447–465.
Moravec, H.P., 1981. Robot Rover Visual Navigation. UMI Research Press, Ann Arbor, MI.
Peli, T., 1981. An algorithm for recognition and localization of rotated and scaled objects. Proc. IEEE 69, 483–485.
Prokop, R.J., Reeves, A.P., 1992. A survey of moment-based techniques for unoccluded object representation and recognition. CVGIP: Graphical Models and Image Processing 54, 438–460.
Shah, S., Aggrawal, J.K., 1995. Modeling structured environments using robot vision. In: Proceedings of 1995 Asian Conference on Computer Vision, Singapore, pp. 297–304.
Weiss, I., 1988. Projective invariants of shapes. In: Proceedings on DARPA Image Understanding Workshop, Cambridge, MA, pp. 1125–1134.
Yeh, E., Kriegman, D.J., 1995. Toward selecting and recognizing natural landmarks. Tech. Report 9503, Yale University.
Zahn, C.T., Roskies, R.Z., 1972. Fourier descriptors for plane closed curves. IEEE Trans. Comput. 21, 269–281.