Multiscale Surface Organization and Description for Free Form Object Recognition

Kim L. Boyer and Ravi Srikantiah
Signal Analysis and Machine Perception Laboratory
The Ohio State University
[email protected]

Abstract

We introduce an efficient, robust means to obtain reliable surface descriptions, suitable for free form object recognition, at multiple scales from range data. Mean and Gaussian curvatures are used to segment the surface into four saliency classes based on curvature consistency, as evaluated in a robust multivoting scheme. Contiguous regions consistent in both mean and Gaussian curvature are identified as the most homogeneous segments, followed by those consistent in mean curvature but not Gaussian curvature, followed by those consistent in Gaussian curvature only. Segments at each level of the hierarchy are extracted in order of size, large to small, such that the most salient features of the surface are recovered first. This has potential for efficient object recognition by stopping once a just sufficient description is extracted.

I. Introduction

We have built a multiscale recognition system that describes objects at successively higher resolutions until a suitable degree of discrimination is obtained.¹ This paper focuses on the surface organization technique. Mokhtarian et al. [1] and Zhang and Hebert [2] propose multiscale description techniques, both of which are computationally intensive. We propose a (pseudo)multiscale analysis based on a curvature consistency criterion. Most prior work in region-based range image segmentation, such as [3, 4, 5, 6, 7, 8], endeavors to classify surface regions into canonical types. The notion of curvature consistency has been explored by Sander and Zucker [9] and Ferrie et al. [10]. Our approach differs in that we simply group contiguous patches of the surface having consistent curvature characteristics. This lends a measure of robustness and stability not heretofore attainable with free form objects. Attempts to segment these surfaces using standard models or surface types often leave large regions either unmodeled or broken into many small pieces conveying little or no real information about the surface.

¹ NB: The resolution of the range image is fixed; it is the resolution of our description of the surface that is adjusted.

Patrick J. Flynn
Department of Computer Science and Engineering
University of Notre Dame
fl[email protected]

II. Surface Organization

Our algorithm extends the voting technique developed by Wuescher and Boyer [11] for extracting constant curvature segments in 2D edge maps. We partition the surface into four types of primitives in descending order of constraint and saliency:

1. Segments having (nearly) constant mean and Gaussian curvatures (HK-segments). These are the most homogeneous and, therefore, the most salient of the segment classes, for a given size.

2. Segments of (nearly) constant mean curvature only (H-segments). These are the next most salient segments.

3. Segments of (nearly) constant Gaussian curvature only (K-segments). The saliency of these segments is essentially identical to those of constant mean curvature, except that the calculation of mean curvature is less noise-sensitive and, therefore, more reliable.

4. Segments for which both curvatures change rapidly (X-segments). These segments aren't homogeneous; they are the "leftovers."

For a given hierarchical level, significance is ranked in decreasing order of segment size, i.e., largest segment first.

A. Curvature Voting: ε-Consistency

With the mean and Gaussian curvatures (H and K, respectively) calculated at each surface point, we next turn our attention to extracting contiguous regions that are consistent in both these measures. We begin by defining tolerance windows ε_H and ε_K for the two curvatures such that, if

|H(p1) − H(p2)| ≤ ε_H  and  |K(p1) − K(p2)| ≤ ε_K,    (1)

then the points p1 and p2 on the surface may be grouped into the same constant curvature bin (an HK-bin).
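The ε-consistency test of Eq. (1) is just a pair of two-sided thresholds. A minimal sketch in Python, using the standard tolerance values of Table 1 (the function name is ours):

```python
# Tolerance windows from Table 1: mm^-1 for mean curvature, mm^-2 for Gaussian.
EPS_H = 25e-3  # tolerance window on mean curvature H
EPS_K = 25e-4  # tolerance window on Gaussian curvature K

def consistent(h1, k1, h2, k2, eps_h=EPS_H, eps_k=EPS_K):
    """Eq. (1): True if two surface points may share an HK-bin."""
    return abs(h1 - h2) <= eps_h and abs(k1 - k2) <= eps_k
```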

1051-4651/02 $17.00 (c) 2002 IEEE

The grouping process begins with a two-dimensional multivoting technique. Each point on the surface casts votes for all bins b_ij in the quantized 2D space B² defined by

B² = { (H_i, K_j) : H_i = i δ_H, K_j = j δ_K }    (2)

with H_i ∈ ℝ, K_j ∈ ℝ, i ∈ ℤ, j ∈ ℤ, and satisfying the constraints

|H(p) − H_i| ≤ ε_H  and  |K(p) − K_j| ≤ ε_K    (3)

where δ_H and δ_K are the quantization bin sizes of the mean and Gaussian curvatures, respectively. Each vote is tagged with the spatial location of the corresponding point. When all the votes are tabulated, the result is a form of two-dimensional histogram, built with overlapping votes. The peaks in this histogram represent the combined mean and Gaussian curvature values most likely to form the largest segments under the combined consistency constraint, although many small segments could generate more votes in one bin than a single large segment in another. Therefore, we sequentially extract the segments according to size.

B. Sequential Segment Extraction

Working with the 2D histogram, we begin with the bin b_ij = (H_i, K_j) receiving the most votes, covering the range (H_i ± ε_H) and (K_j ± ε_K).

We form groupings of contiguous sets of points casting votes into this bin and identify the largest such set. We examine the segments sequentially, always retaining the identity of the current largest segment. Once the point count of the segments remaining to be considered in the bin is less than the size of the current largest segment, we can declare that segment the largest in this bin, although not necessarily in the data set. Therefore, after finding the largest segment in the most populous bin, we extract the largest segment in the next most populous bin, retaining it if it is larger than the largest segment from the first bin. This procedure continues until the size of the largest recovered segment over all examined bins exceeds the number of votes in the largest remaining bin. At this point, it is impossible to find a segment in that or any subsequent bin larger than the current largest segment. The points on the surface corresponding to this segment are labeled and deleted from the surface, their votes are removed from the histogram, and the process repeats from the beginning with the now-modified histogram. This process continues until there are no longer any contiguous sets of points that satisfy the consistency constraint (Eq. 1) and are larger than a prespecified minimum area, Amin. This minimum area determines the smallest constant curvature segment we will extract, and is chosen according to the nature of the problem domain.
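The voting and extraction steps can be sketched as follows. This is a much-simplified illustration, not the authors' implementation: curvature maps are plain dicts keyed by pixel location, contiguity is taken as 4-connectedness found by flood fill, all function names are ours, and the vote-removal repeat loop and Amin cutoff are omitted for brevity.

```python
from collections import defaultdict
from math import floor, ceil

def cast_votes(H, K, eps_h, eps_k, dh, dk):
    """Each pixel votes for every bin (i, j) whose center (i*dh, j*dk)
    satisfies the tolerance constraints of Eq. (3). H and K map
    (row, col) -> curvature value; votes are tagged with the pixel."""
    bins = defaultdict(set)
    for p, h in H.items():
        k = K[p]
        for i in range(ceil((h - eps_h) / dh), floor((h + eps_h) / dh) + 1):
            for j in range(ceil((k - eps_k) / dk), floor((k + eps_k) / dk) + 1):
                bins[(i, j)].add(p)
    return bins

def largest_segment(voters):
    """Largest 4-connected set of pixels among one bin's voters."""
    best, seen = set(), set()
    for seed in voters:
        if seed in seen:
            continue
        comp, stack = set(), [seed]
        while stack:
            r, c = stack.pop()
            if (r, c) in comp or (r, c) not in voters:
                continue
            comp.add((r, c))
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def extract_next_segment(bins):
    """Scan bins from most to least populous; stop once the best segment
    found so far is at least as large as the next bin's vote count,
    since no later bin can then hold a larger contiguous segment."""
    best = set()
    for voters in sorted(bins.values(), key=len, reverse=True):
        if len(best) >= len(voters):
            break
        seg = largest_segment(voters)
        if len(seg) > len(best):
            best = seg
    return best
```

In the full algorithm, the returned segment's votes would be removed from the histogram and the scan repeated until no remaining segment exceeds Amin.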

C. Extracting H and K Segments

On their own, regions consistent in both curvatures are insufficient to describe most realistic objects. Therefore, the next step is to extract those regions consistent in either mean or Gaussian curvature alone. As we have described, these regions are less salient than the HK segments, but are still highly valuable for recognition. The H segments, having consistent mean curvature and more noise immunity, are extracted first, followed by the K segments (consistent Gaussian curvature). The algorithm for each is the same, and is simply a one-dimensional version of the two-dimensional voting and extraction scheme described for the HK segments.

D. Completing the Process: X Segments

There will remain portions of the surface not assigned to any segment. Those of area greater than Amin are labeled as X segments, indicating that both curvatures are changing rapidly over these regions. Nevertheless, we do use these segments in the surface description. Segments smaller than Amin are not useful descriptors of the surface; they are simply merged with the surrounding extracted segments.

E. Algorithm Parameters

The algorithm's parameters are listed below, with their respective impacts on the grouping result.



- Scale of Gaussian smoothing (σ): Not critical, because we compute curvature analytically from polynomial fits.

- Curvature tolerances (ε_H and ε_K): These control the scale of the surface organization.

- Quantization bin sizes (δ_H and δ_K): Not critical; δ = ε/5 is a good choice for both curvatures.

- Minimum segment size (Amin): The lower limit on the size of any segment of interest.
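For reference, the standard parameter values (Table 1) can be gathered into a single configuration object. This container is our convenience, not part of the paper's implementation; the field names are ours:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GroupingParams:
    """Standard parameter values from Table 1 (units in comments)."""
    delta_h: float = 5e-3   # quantization bin size, mean curvature (mm^-1)
    delta_k: float = 5e-4   # quantization bin size, Gaussian curvature (mm^-2)
    eps_h: float = 25e-3    # tolerance window, mean curvature (mm^-1)
    eps_k: float = 25e-4    # tolerance window, Gaussian curvature (mm^-2)
    sigma: float = 2.0      # Gaussian smoothing scale (mm)
    a_min: float = 500.0    # minimum segment area (mm^2)
```

Note that the standard values obey the δ = ε/5 rule of thumb quoted above for both curvatures.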

III. Experimental Organization Results

We present our segmentation results in two parts. We first analyze the effect of each parameter on the segmentation. Next, we present "standard" segmentation results for several objects to demonstrate the stability of the algorithm with respect to various surface types. These are but a small sample of typical results from a large experimental base. In the images, color (or grayscale) is used only to distinguish regions; there is no other information encoded in the color scheme. These experiments use data of roughly 0.5 mm resolution on objects of 70-125 mm in length.

We begin by choosing a set of standard parameter values (Table 1). Without a ground truth decomposition, we based these parameters on subjective assessment over a fairly wide range of surfaces.

Parameter   Standard Value   Units
δ_H         5 × 10⁻³         mm⁻¹
δ_K         5 × 10⁻⁴         mm⁻²
ε_H         25 × 10⁻³        mm⁻¹
ε_K         25 × 10⁻⁴        mm⁻²
σ           2                mm
Amin        500              mm²

Table 1: Standard parameters.

In Fig. 1 we illustrate the effect of (roughly) 50% changes in the tolerance intervals. The standard result is at the upper right. Changing the tolerance windows produces a gradual change in detail, as we expect. Smaller window sizes impose tighter homogeneity constraints, producing smaller segments with more detail. As the tolerance levels are increased, details gradually disappear and segments begin to merge to yield larger, more salient segments. We use this, together with the minimum segment size, as a tuning parameter to implement a (pseudo)multiscale segmentation, as shown below. In studies not shown for lack of space, we found the algorithm remarkably stable to changes in the other parameters, even over 3:1 ratios.

We present segmentation results in Figs. 2 and 3 for various objects using the standard parameters. In Fig. 2 we have:

1. Apple: Upper left. A nearly perfect segmentation.

2. Dumbbell: Upper right. Made from children's modeling clay; the irregular segments in the bulbs correspond to actual shallow depressions in the body.

3. Whale: Lower left. The bottom view reveals a smooth transition to the concave belly; fins lie to either side.

4. Dinosaur: Lower right. A series of stable spikes is present along the length of its back and head.

Scale   ε_H         ε_K         Amin
0       40 × 10⁻³   40 × 10⁻⁴   800
1       35 × 10⁻³   35 × 10⁻⁴   700
2       30 × 10⁻³   30 × 10⁻⁴   600
3       25 × 10⁻³   25 × 10⁻⁴   500
4       20 × 10⁻³   20 × 10⁻⁴   400
5       15 × 10⁻³   15 × 10⁻⁴   300

Table 2: Parameters by scale. Parameters not specified take on the standard values.

IV. Surface Description and Object Recognition

We lack the space to describe the recognition system in full. It is an attributed graph approach that represents surface segments as nodes and spatial relationships as arcs. Node attributes include the curvatures, area, saliency, location, and orientation; arc attributes include relative orientation and distance. The modelbase is constructed at multiple scales (Table 2), and recognition begins by segmenting the scene first at a coarse scale, then successively reducing the scale until a suitable match is recovered. Recognition is accomplished by constructing an inexact subgraph isomorphism between the object model graph and the graph of surface primitives in the scene. The matching score is based on weighted error measures in the unary and binary attributes, and on the fraction of the scene surface not covered by the match. The search terminates when enough of the scene is explained, with suitably low error or ambiguity, or when the nodes are exhausted. The multiscale approach thus identifies a "natural" scale at which the object's match error drops and a substantial fraction of its visible surface is explained. There may be more than one such scale for some objects. In 14 tests on a database of 10 free form objects (those above, and others similar), we achieved error-free recognition. We are hardly naive enough to tout our approach as error-free, given the simplicity of the experiment, but it is clear that this surface organization and description formalism shows considerable promise in the difficult domain of free form object recognition.
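The paper does not give the exact matching-score formula, so the following is only an illustrative sketch of the idea described above: a weighted additive combination of the mean unary (node) error, the mean binary (arc) error, and the uncovered scene fraction. The weights, the additive form, and the function name are all our assumptions.

```python
def match_score(node_errors, arc_errors, uncovered_fraction,
                w_node=1.0, w_arc=1.0, w_cover=1.0):
    """Lower is better. Combines weighted unary (node) and binary (arc)
    attribute errors with a penalty for scene surface left unexplained.
    The additive form and unit weights are illustrative assumptions."""
    unary = sum(node_errors) / max(len(node_errors), 1)
    binary = sum(arc_errors) / max(len(arc_errors), 1)
    return w_node * unary + w_arc * binary + w_cover * uncovered_fraction
```

A match would be accepted once this score falls below a threshold and the uncovered fraction is small, mirroring the termination condition described above.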

And in Fig. 3:

1. Lizard: Left. A large featureless region is broken by X segments near the tail; the segmentation is more stable about the arms and mouth.

2. Lamb: Right. Bottom view; the legs and belly are significant, and a negatively curved region near the chin is extracted. Some X regions appear in the transitions.

V. Closing Remarks

We have introduced a novel means to obtain reliable surface descriptions from range data for the modeling and recognition of free form objects. The organization is robust, yet tunable for multiscale use. Unlike traditional methods in which surface types (models) are imposed, we achieve concise descriptions of very general, free form surfaces. We extract surface segments sequentially in a partial ordering by saliency. This allows the embedding of surface description within the object recognition loop.


VI. References

[1] F. Mokhtarian, N. Khalili, and P. Yuen, "Multi-scale free-form surface description," Proc. Indian Conference on Computer Vision, Graphics, and Image Processing, pp. 70-75, 1998.

[2] D. Zhang and M. Hebert, "Multi-scale classification of 3D objects," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 864-869, June 1997.

[3] M. Hebert and J. Ponce, "A new method for segmenting 3-D scenes into primitives," Proc. Int. Conf. Pattern Recognition, pp. 836-838, October 1982.

[4] M. Oshima and Y. Shirai, "Object recognition using three-dimensional information," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-5, pp. 353-361, July 1983.

[5] P. Besl and R. Jain, "Segmentation through variable order surface fitting," IEEE Trans. Pattern Anal. Machine Intell., vol. 10, no. 2, pp. 167-192, 1988.

[6] B. C. Vemuri and J. K. Aggarwal, "Curvature-based representation of objects from range data," Image Vision Computing, vol. 4, no. 2, pp. 107-114, 1986.

Figure 1: Multiscale segmentation: (upper left) ε_H = 15 × 10⁻³, ε_K = 15 × 10⁻⁴; (upper right) ε_H = 25 × 10⁻³, ε_K = 25 × 10⁻⁴; (lower left) ε_H = 35 × 10⁻³, ε_K = 35 × 10⁻⁴.

[7] R. M. Haralick, L. T. Watson, and T. J. Laffey, "The topographic primal sketch," Int. J. Robotics Res., vol. 2, no. 1, pp. 50-72, 1983.

[8] P. J. Flynn and A. K. Jain, "Surface classification: Hypothesis testing and parameter estimation," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 261-267, June 1988.

[9] P. Sander and S. Zucker, "Inferring differential structure from 3-D images: Smooth cross sections of fiber bundles," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 833-854, 1990.

[10] F. P. Ferrie, S. Mathur, and G. Soucy, "Feature extraction for 3-D model building and object recognition," in Three-Dimensional Object Recognition Systems (A. K. Jain and P. J. Flynn, eds.), pp. 57-88, Elsevier, 1993.

[11] D. M. Wuescher and K. L. Boyer, "Robust contour decomposition using a constant curvature criterion," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, pp. 41-51, January 1991.

Figure 2: Standard segmentation results.

Figure 3: More standard results.
