2010 International Conference on Pattern Recognition

Efficiently computing optimal consensus of digital line fitting

Yukiko Kenmochi, Lilian Buzer, Hugues Talbot
Université Paris-Est, Laboratoire d'Informatique Gaspard-Monge, Equipe A3SI, France
{y.kenmochi, l.buzer, h.talbot}@esiee.fr

1051-4651/10 $26.00 © 2010 IEEE. DOI 10.1109/ICPR.2010.266

Abstract: Given a set of discrete points in a 2D digital image containing noise, we formulate our problem as robust digital line fitting. More precisely, we seek the maximum subset whose points are included in a digital line, called the optimal consensus. The paper presents an efficient method for exactly computing the optimal consensus by using the topological sweep, which results in an algorithm with quadratic time complexity and linear space complexity with respect to the number of input points.

Keywords: digital line; fitting; optimal consensus

I. INTRODUCTION

Straight line fitting is an essential task in the field of computer vision. The procedure is used extensively in image understanding and 3D reconstruction. The problem can be viewed as a parameter estimation method [1]. Regarding optimal estimation, the most commonly used methods derive from a continuous line model, which is defined as a set of Euclidean points satisfying a linear equation (see (1) in the next section). Fitting is typically carried out by optimizing various cost functions. For instance, least-squares fitting minimizes the sum of geometric distances from all given points to the model, while least-absolute-value fitting uses the vertical distances [2]. However, these are not robust to the presence of outliers. Conversely, Least Median of Squares regression, which minimizes the median of the vertical/geometric distances, is robust as long as fewer than half of the given points are outliers [3].

In order to process data when more than half of the points are outliers, we earlier introduced a different optimization problem using a digital line model, which is defined by a set of discrete points satisfying two linear inequalities (see (2) in the next section) [4]. Given an arbitrary cloud of discrete points, our problem is to find the maximum subset whose points are included in a digital line, called the optimal consensus. In order to solve this combinatorial optimization problem, we focused on a geometric property of consensus sets for digital lines, implying that all consensus sets can be generated from a given point set. This then allowed us to develop an algorithm that finds the optimal solution with time complexity O(N² log N) and space complexity O(N), where N is the number of points.

In this paper, we adopt a different geometric interpretation so that the problem turns out to be an application of line arrangements in computational geometry [5]. We use the topological sweep [6], which now provides us with time complexity O(N²), without compromising space complexity.

II. THE PROBLEM OF DIGITAL LINE FITTING

A line in the Euclidean space R² is defined by

L = {p ∈ R² : n · p + b = 0}    (1)

where n = (a, 1) or (1, a), a, b ∈ R and −1 ≤ a ≤ 1. This line model is continuous since a line is regarded as a set of points in R². However, the input considered in this paper is a digital image, so that the coordinates of the given points are integers. This means that in practice we may rarely have points exactly on L due to image digitization. In order to avoid this problem, we use a digital model, called a digital line, which allows us to treat a line as a set of discrete points in Z² instead, where Z is the set of all integers. A digital line, which is the digitization of L, is defined by the set of discrete points satisfying two inequalities:

D(L) = {p ∈ Z² : 0 ≤ n · p + b < w}    (2)

where w is a given constant value [7]. Geometrically, D(L) is a set of discrete points lying between two parallel lines n · p + b = 0 and n · p + b = w, and w specifies the vertical or horizontal distance between these lines, depending on whether n = (a, 1) or (1, a), respectively. From the digital geometrical viewpoint [7], [8], w should not be less than 1 if we expect D(L) to be 8-connected. In other words, 1 is the minimum distance that keeps the connectivity of a digital line.

Using the above digital line model, our fitting problem is then described as follows: given a finite set of discrete points S = {p_i ∈ Z² : i = 1, 2, . . . , N}, we seek the maximum subset of S whose elements are contained in some D(L). Points p_i ∈ S are called inliers for D(L) if p_i ∈ S ∩ D(L); otherwise, they are called outliers. Using this term, our problem is also described as finding the maximum inlier set in S, called the optimal consensus of digital line fitting. Figure 1 (left) depicts an example of digital line fitting.
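As a small illustration of definition (2), the inlier test can be sketched in Python as follows. This is only a sketch under stated assumptions: the function names `in_digital_line` and `consensus_set`, the `steep` flag (selecting n = (1, a) instead of n = (a, 1)), and the sample points are hypothetical and not taken from the paper. Exact rational arithmetic is used so that the boundary cases of the two inequalities are decided without rounding.

```python
from fractions import Fraction

def in_digital_line(p, a, b, w=1, steep=False):
    """Return True if the integer point p satisfies 0 <= n.p + b < w.

    n = (a, 1) when steep is False, n = (1, a) when steep is True.
    """
    x, y = p
    value = x + a * y + b if steep else a * x + y + b
    return 0 <= value < w

def consensus_set(points, a, b, w=1, steep=False):
    """Inliers of the digital line D(L) among the given points."""
    return [p for p in points if in_digital_line(p, a, b, w, steep)]

# Hypothetical data: five points of a digitized line plus two outliers.
points = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (2, 5), (3, -3)]
a, b = Fraction(-1, 2), Fraction(1, 2)  # the line -x/2 + y + 1/2 = 0
inliers = consensus_set(points, a, b)
```

With these parameters the five digitized points are inliers and the two remaining points are outliers. A point such as (1, 1), for which n · p + b equals w exactly, is excluded because the upper inequality is strict.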

Figure 1. Digital line fitting (left) and its equivalent problem of stabbing line segments (right).

III. THE PROBLEM AS STABBING LINE SEGMENTS

A. The problem in the primal space

In this section, we show that our digital line fitting problem can be considered as searching for a line that stabs the maximum number of line segments. A geometric interpretation is given as follows: we give a distance w to each point p_i ∈ S rather than to L, and then reconsider our problem. For each point p_i ∈ S, we first consider the vertical or horizontal line segment of length w given by

{p_i − λe : 0 ≤ λ < w}

where e = (0, 1) or (1, 0), depending on whether n = (a, 1) or (1, a) for L in (2). Note that this segment has two endpoints, p_i and p_i − we, but contains only p_i. Given all such segments for p_i ∈ S, we then look for the maximum subset of segments that are stabbed by a line L. Figure 1 (right) illustrates stabbing line segments, which is equivalent to the problem in Figure 1 (left). As seen in Figure 1 (right), an infinite number of lines can stab the maximum number of line segments. In order to efficiently obtain the set of such optimal stabbing lines, we need to consider our problem in the dual space of the duality transform [6].

B. The problem in the dual space

We use (1) for the transform between the primal space and the dual space, so that a line L with parameters (a, b) in the primal space is associated with the point (a, b) in the dual space. Conversely, a point p_i in the primal space is associated with the line n · p_i + b = 0 in the dual space. Note that, in this sense, with respect to the two cases n = (a, 1) and n = (1, a), we need to consider two duality transforms. Since one of them is easily obtained from the other by simply switching the two coordinates, we only consider the case n = (a, 1) hereafter.

Each line segment is thus regarded as the region between the two parallel lines that correspond to its two endpoints, namely n · p_i + b = 0 and n · p_i + b = w, as illustrated in Figure 2. This region is called the strip, defined by

{(a, b) : 0 ≤ n · p_i + b < w}

Note that the strip includes only the line n · p_i + b = 0 but not the line n · p_i + b = w. In order to distinguish these two types of lines, we call them the lower and upper lines, respectively. Recall that a point (a, b) in the dual space corresponds to the line L in the primal space. Then, if the point is included in a strip in the dual space, the line L stabs the corresponding line segment in the primal space. In fact, the set of strips for all p_i ∈ S divides the dual space as illustrated in Figure 2, and each divided region corresponds to a consensus set of digital line fitting for S, namely the set of points p_i ∈ S whose strips contain the region.

Figure 2. Each line segment in Figure 1 (right) is associated with a strip in the dual space. Regions divided by all strips for p_i ∈ S correspond to all consensus sets of digital line fitting for S.

Our problem can then be considered in the dual space: given the set of strips for all p_i ∈ S, find the divided region that is covered by the maximum number of strips. Note that if a region is not covered by any strip, its consensus set is empty. The number of divided regions is O(N²), among which there may be many empty regions.

IV. TOPOLOGICAL SWEEP FOR DIGITAL LINE FITTING

In order to efficiently search for the optimal divided region in the dual space, we use the topological sweep algorithm for an arrangement of lines [6], whose time complexity is O(N²) and space complexity is O(N). Note that we have 2N lines instead of N in this paper, which does not change the complexities. Indeed, the problem of stabbing line segments is treated as an application of the topological sweep in [6]; however, in their article, line segments may not be vertical in the primal space, so that their dual representation becomes a non-necessarily parallel pair of lines. In our problem, every line segment has a dual representation consisting of two parallel lines, the lower and upper lines, which are the boundary of each strip. We show in this section that our problem can be solved in a similar manner.

The topological sweep algorithm uses a topological line, called a cut, for sweeping the arrangement of a given set of lines in the plane.
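To make the dual-space formulation concrete, and as a possible baseline for checking a sweep implementation on small inputs, the following Python sketch finds a maximally covered point by brute force. It is not the topological sweep and runs in roughly O(N³) time. It relies on one observation that is an assumption of this sketch rather than a statement from the paper: from any point of an optimal region, b can be decreased until the lower line of some covering strip is reached, so it suffices to search along each lower line for −1 ≤ a ≤ 1. The name `best_consensus` is hypothetical, only the case n = (a, 1) is handled, and w is assumed to be an integer or a `Fraction`.

```python
from fractions import Fraction
from itertools import chain

def best_consensus(points, w=1):
    """Brute-force optimal consensus for n = (a, 1) with -1 <= a <= 1.

    points: pairs of integers; w: int or Fraction. Roughly O(N^3) time.
    """
    best = []
    for xi, yi in points:
        # Stay on the lower line of p_i in the dual space: b = -(a*xi + yi).
        # On that line, p_j is an inlier iff 0 <= a*dx + dy < w.
        cands = {Fraction(-1), Fraction(1)}
        for xj, yj in points:
            dx, dy = xj - xi, yj - yi
            if dx != 0:
                # Values of a where p_j enters or leaves its strip.
                for t in (Fraction(-dy, dx), Fraction(w - dy, dx)):
                    if -1 <= t <= 1:
                        cands.add(t)
        ordered = sorted(cands)
        # Midpoints sample the open pieces between consecutive breakpoints,
        # which matters because the strips are half-open.
        mids = [(s + t) / 2 for s, t in zip(ordered, ordered[1:])]
        for a in chain(ordered, mids):
            b = -(a * xi + yi)
            inliers = [(x, y) for x, y in points if 0 <= a * x + y + b < w]
            if len(inliers) > len(best):
                best = inliers
    return best
```

The second case, n = (1, a), can be handled with the same function by swapping the coordinates of every input point and swapping them back in the result.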

A cut is considered as a monotonic line in the vertical direction, which intersects each line exactly once, and is specified by the sequence of lines it intersects. A sweep starts with the leftmost cut, whose line sequence is in ascending order of slope, and pushes it to the right until it becomes the rightmost cut, whose line sequence is in descending order of slope. A cut is pushed by passing an intersection of lines when they are consecutive in the sequence of the current cut. See [6] for more details.

To solve our problem, we need to visit all regions divided by the lower and upper lines of all p_i ∈ S, and count for each region the number of strips that cover it. As mentioned above, we have two different types of lines for each p_i ∈ S, called lower and upper lines. The initial cut has a sequence of 2N lines, in which two consecutive lines with the same slope are in descending order of their intercepts. We also have a sequence of 2N − 1 regions, each of which is bounded by a pair of consecutive lines, and their consensus-set-size sequence is illustrated in Figure 3. As pointed out in [6], such a consensus-set size can be propagated in constant time from one region to the next during the topological sweep, as shown in Figure 4.

Figure 3. Initial cut of the topological sweep and consensus-set size for each region.

Figure 4. Four cases for the consensus-set size propagation of regions during the topological sweep; dotted and non-dotted lines represent upper and lower lines.

Our input S is a set of discrete points whose coordinates are all integers, so that all computations can be done exactly using only integers or rational numbers. While desirable, this makes degenerate cases such as parallel lines and multiple concurrent lines more likely than in the computational geometry framework. For degenerate cases, we use the modified algorithm [9], with the same complexity as the original one. The consensus-set size propagation at the intersection of multiple lines, as illustrated in Figure 5, is realized as follows. Suppose that the consensus-set size of the current region  is  , and that there are  upper lines and  lower lines in the current cut above , and  upper lines and  lower lines below . Then, the consensus-set size of the new region  is obtained by

which is the general formulation for both degenerate and non-degenerate cases. Since we are interested in the regions satisfying  , we keep  only for those regions.

Figure 5. An example of degenerate cases: multiple concurrent lines.

V. EXPERIMENTS

We carried out experiments on a    image of digitized lines with   , while adding or deleting  points randomly. The final number of input points was  , as shown in Figure 6 (a). The proposed method gave us the two optimal consensus sets of digital line fitting, each of which contains   inliers, illustrated as the red points in Figure 6 (b) and (c), respectively. Note that we need to apply the method twice with respect to the two cases n = (a, 1) and n = (1, a); Figure 6 (b) is obtained for the first case while Figure 6 (c) is obtained for the second case. Our implementation was based on the C++ program [1