2011 18th IEEE International Conference on Image Processing
NONPARAMETRIC POLYGONAL AND MULTIMODEL APPROXIMATION OF DIGITAL CURVES WITH RATE-DISTORTION CURVE MODELING

Alexander Kolesnikov
School of Computing, University of Eastern Finland, Box 111, FIN-80101 Joensuu, FINLAND
[email protected]

ABSTRACT

How many linear segments are sufficient to represent a shape? In this paper we consider the problem of the nonparametric polygonal and multimodel approximation of digital curves. To solve the problem, we propose an algorithm that is based on a parameterized model of the Rate-Distortion curve and a multiplicative cost function. By analyzing the minimum of the cost function, we define a solution that produces the best possible balance between the number of segments and the approximation error. In the experiments, the algorithm produced relevant polygonal approximations and a relevant two-model approximation (with linear segments and circular arcs).

Index Terms: Shape; Graphical model; Piecewise linear approximation; Curve fitting.

1. INTRODUCTION

The approximation of digital curves with line segments is used in image processing, computer graphics, shape analysis and encoding, as well as in digital cartography, where it reduces the description of digital curves that need to be processed, stored and transmitted. The problem of the polygonal approximation of digital curves is usually formulated in one of the following two forms: 1) min-# problem: given a constraint on the approximation error, approximate a curve with the minimum number of line segments M; and 2) min-ε problem: given the number of approximating line segments M, approximate the input curve with the minimum approximation error. For the error measure L2, these problems can be solved by means of optimal and near-optimal methods [1-3].

In some cases, a more relevant model (circular or elliptical arcs, polynomials, splines) or a set of models can be used to represent shapes with fewer segments. But the more sophisticated the model is, the more data needs to be stored or transmitted in order to represent a curve segment with the model. In such cases, the description length of an approximation model is defined in terms of the amount of data that is needed to describe the model [4]. By analogy with the polygonal approximation, we can thus formulate min-# and min-ε problems for multimodel approximation.

We can therefore find an approximation if an input parameter (i.e. the number of segments or the error threshold) is known. Before we perform an approximation, we need to answer the following question: What is the most suitable value of the input parameter? What is needed, therefore, is a method that determines the input parameter of the approximation algorithm for a concrete curve.

One possible approach to this problem is to find a solution that offers the best balance between the number of segments M and the Integral Square Error (ISE) E2. This can be done by introducing a cost function that incorporates the number of segments and the approximation error. In the Lagrange multiplier algorithm, we search for the solution with the minimal value of the additive cost function C=E2+λM [5]. The trade-off between the number of segments and the approximation error is controlled by the user-defined Lagrange multiplier λ. In [6], a multiplicative cost function, the Figure of Merit (FOM), was introduced for evaluating solutions obtained with heuristic algorithms: FOM=E2·M.
978-1-4577-1302-6/11/$26.00 ©2011 IEEE
Figure 1. Set of the test curves: a) shape #1 [11], N=2205; b) shape #2, N=6603; c) shape #3 [8], N=428; d) shape #4 [10], N=746; e) shape #5 [10], N=1042; f) shape #6, N=939.
Figure 2. Rate-Distortion curves for the shapes #1-5 for polygonal approximation (left) and for the shape #6 for two-model approximation with linear segments and circular arcs (right). The coefficient a is larger for smoother curves.
Rosin [7] criticized this criterion on the grounds that the two terms, M and E2, in the FOM are not balanced. In the modified criterion, FOM-n, proposed in [8], the number of segments M is raised to a power in order to reduce this bias: FOM-n = E2·M^n, where n=2 or 3. A similar criterion, the relative error, was proposed earlier in [9] as follows: Er = M E2.

While it was originally created as a criterion for evaluating heuristic algorithms, the criterion FOM-n can also be used to define the most suitable input parameter. In [10], heuristic methods were used for analyzing the FOM-2 curve so that the value of the input parameter M could be determined for the so-called reference solution.

In this paper, we offer a direct method for approaching the problem of nonparametric approximation. First, we construct a parameterized model of the Rate-Distortion curve. Then, we derive a cost function from the R-D curve model and use the minimum of the cost function to define the optimal number of approximation segments.

2. NONPARAMETRIC ALGORITHM FOR DIGITAL CURVES APPROXIMATION

2.1. Modeling the Rate-Distortion Curve

Let us consider an open planar N-vertex curve P={p(1), …, p(N)}, where p(n)=(x(n), y(n)). The polygonal curve P is approximated by another polygonal curve Q={q(1), …, q(M+1)} with M line segments, where q(m)=p(i_m). The approximation error with the measure L2 for a curve segment is defined as the sum of the squared distances between each point p(n) of the curve segment {p(i_m), …, p(i_{m+1})} and the approximation line defined by the points p(i_m) and p(i_{m+1}). The Integral Square Error E2(M) for the curve P is defined as the sum of the approximation errors for all the approximation segments [1].

In the case of a multimodel approximation, we are given a set of K approximation models (functions) Φ={ϕ1, …, ϕK}. The input curve P is divided into segments, and each segment is approximated by a function ϕk from the set Φ. The description length r_k(i_m, i_{m+1}) of a segment is defined as the amount of data (the number of parameters) needed to describe the segment with the model ϕk. The total description length L is the sum of the description lengths of the segments. The problem of multimodel approximation can thus be solved with the Dynamic Programming algorithm [4].

To construct the Rate-Distortion (R-D) curve for the input curve P, we have to calculate the approximation errors E2(M) for M∈[M1,M2] with the Dynamic Programming algorithm for polygonal [1-3] or for multimodel [4] approximation, respectively. The R-D curves (RDC) for the test data are presented in Fig. 2. The R-D curves can be approximated by using the following model:

$$\lg \hat{E}_2(M) = b_1 - a \cdot \lg(M) \quad \text{or} \quad \hat{E}_2(M) = b_2 / M^a. \qquad (1)$$

Here the R-D curve parameter a may be found as follows:

$$a = \frac{(M_2 - M_1 + 1)\sum_{m=M_1}^{M_2} \lg(m)\,\lg(E_2(m)) - \sum_{m=M_1}^{M_2} \lg(m) \sum_{m=M_1}^{M_2} \lg(E_2(m))}{\left(\sum_{m=M_1}^{M_2} \lg(m)\right)^2 - (M_2 - M_1 + 1)\sum_{m=M_1}^{M_2} \lg^2(m)}. \qquad (2)$$

The value of the coefficient a depends on the range [M1,M2] of the parameter M. We can control the process by selecting the range.
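The fit of the R-D model can be illustrated with a short Python sketch. It assumes that the errors E2(m) for m in [M1,M2] are already available (for example from a DP run, which is not shown) and that they are all positive; the function name fit_rd_exponent and the synthetic data are hypothetical and serve only as an illustration of a least-squares fit of the log-log model.

```python
import math

def fit_rd_exponent(ms, errors):
    """Least-squares fit of lg E2(M) = b1 - a*lg(M); returns (a, b1).

    ms     -- segment counts m = M1..M2 (assumed positive)
    errors -- corresponding ISE values E2(m) (assumed strictly positive)
    """
    if len(ms) != len(errors) or len(ms) < 2:
        raise ValueError("need at least two (m, E2) pairs of equal length")
    x = [math.log10(m) for m in ms]
    y = [math.log10(e) for e in errors]
    n = len(x)  # n = M2 - M1 + 1 when ms covers the full range
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # a is the negated regression slope of lg E2 against lg m
    a = (n * sxy - sx * sy) / (sx * sx - n * sxx)
    b1 = (sy + a * sx) / n
    return a, b1

# Synthetic R-D curve that follows the model exactly: E2 = 500 / M^2.5
ms = list(range(5, 41))
errors = [500.0 / m ** 2.5 for m in ms]
a, b1 = fit_rd_exponent(ms, errors)
```

On this synthetic curve the fit recovers a close to 2.5 and 10^b1 close to 500; on real R-D curves the result depends on the chosen range [M1,M2], as noted above.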
Figure 3. Scaled FOMs for: a) polygonal approximation of the shape #1, Mopt=28; b) polygonal approximation of the shape #2, Mopt=15; c) polygonal approximation of the shape #3, Mopt=35; d) polygonal approximation of the shape #4, Mopt=22; e) polygonal approximation of the shape #5, Mopt=21; f) two-model approximation of the shape #6, Lopt=45 (3 linear segments and 13 circular arcs).
2.2. Parameterized criterion FOM-a

Let us introduce the parameterized Figure of Merit (FOM-a) by using the RDC parameter a as follows: FOM-a = E2·M^a. From (1) it follows that the functional I_RDC = Ê2·M^a is a constant.

Although we cannot simultaneously minimize the number of segments M and the approximation error E2, we can find pairs (M, E2) such that the value of the criterion FOM-a is smaller than that for other pairs. The global minimum of the criterion FOM-a at the point Mmin gives us a candidate solution with the number of segments M (or the description length L) and ISE E2. We should, of course, exclude from consideration the degenerate case when E2→0.

The proposed algorithm for nonparametric approximation therefore consists of the following steps:
1) Constructing the R-D curve by means of the DP algorithm and calculating the parameter a.
2) Calculating the global minimum of the criterion FOM-a in order to find the input parameter (i.e. the number of segments Mopt or the description length Lopt).
3) Approximating the input curve by using the input parameter that has been found.
The time complexity of the algorithm is cubic.

3. RESULTS AND DISCUSSIONS

We tested the proposed algorithm on a 2.3 GHz Pentium 4. For the tests, we used vector maps, shapes in MPEG7 CE and digitized curves from [8,10,11]. The results of the experiment are set out for six test shapes (see Fig. 1). Rate-Distortion curves for the test shapes are presented in Fig. 2. A comparison of the parameterized criterion FOM-a and the criterion FOM-n is given in Fig. 3.

We performed two groups of experiments. Firstly, we have shown that the proposed parametric model with parameter a describes the behavior of R-D curves quite well (see Fig. 2). For the test shapes, the value of the RDC parameter is in the range from 1.7 to 3.7. Actually, we are more interested in the deviations of the R-D curve from its model, because the point that deviates most gives the value of the input parameter that we are looking for.

Secondly, we compared the proposed algorithm with the two nonparametric methods [8,11] (see Figs. 3,4). The proposed algorithm differs from the heuristic methods in the following two ways: 1) we use the parameterized criterion FOM-a; 2) our method is based on the optimal algorithms for approximation.

Then we compared the proposed algorithm with the method [10], which is based on the optimal algorithm for polygonal approximation [1] and an analysis of the criterion FOM-2. The proposed algorithm is more efficient for two reasons: 1) we use the parameterized criterion FOM-a; 2) we make use of the minimum of the cost function FOM-a, whereas the algorithm [10] is based on a more complicated analysis of the FOM-2 curve with heuristic methods.

Overall, the proposed nonparametric algorithm with analysis of the parameterized criterion FOM-a provided adequate shape representation with a smaller number of segments than the algorithms [8,10,11] (see Fig. 4).

Figure 4. Results of polygonal approximation with the heuristic nonparametric algorithms (top row): a1) noisy shape #1, M=31 [11]; b1) shape #3, M=45 [8]; c1) shape #4, M=52 [10]; d1) shape #5, M=47 [10], and with the proposed nonparametric algorithm (bottom row): a2) noisy shape #1, Mopt=28; b2) shape #3, Mopt=35; c2) shape #4, Mopt=22; d2) shape #5, Mopt=21.

The processing time for the proposed algorithm is about 7 sec for the 2205-vertex shape #1 and about 200 sec for the 6603-vertex shape #2. The processing time for the 939-vertex shape #6 is about 10 sec. We can reduce the time complexity of the proposed algorithm by using the fast algorithms that we developed earlier in [2,3]. Another possible topic for future research is how the proposed algorithm may be modified for the error measure L∞. This could be done, for example, by including the maximum deviation in the cost function [11].

Figure 5. Results of approximation with the proposed nonparametric algorithm: polygonal approximation of the shape #2, Mopt=15 (left), and two-model approximation of the shape #6, Lopt=45: 3 linear segments and 13 circular arcs (right).

4. CONCLUSIONS

The main results can be summarized as follows: 1) We have introduced the parameterized model of the Rate-Distortion curve for polygonal and multimodel approximation and derived the parameterized multiplicative cost function. 2) By basing our algorithm on an analysis of the Rate-Distortion curve, we arrived at a solution that offers the best balance between the number of segments and the approximation error in terms of the multiplicative cost function. The algorithm can be used in computer graphics, shape analysis, modeling and comparison.
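Steps 1 and 2 of the algorithm in Section 2.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the R-D curve values E2(M) are assumed to be precomputed (the DP algorithm itself is not shown), the whole supplied range is used as [M1,M2], and the function name fom_a_optimum, the threshold eps and the synthetic data are hypothetical.

```python
import math

def fom_a_optimum(ms, errors, eps=1e-12):
    """Fit the R-D exponent a, then return (M_opt, a), where M_opt
    minimizes FOM-a = E2(M) * M**a over the supplied range."""
    # Exclude the degenerate case E2 -> 0 (eps is an assumed threshold)
    pairs = [(m, e) for m, e in zip(ms, errors) if e > eps]
    if len(pairs) < 2:
        raise ValueError("need at least two points with E2 > eps")
    x = [math.log10(m) for m, _ in pairs]
    y = [math.log10(e) for _, e in pairs]
    n = len(pairs)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # Least-squares estimate of a in lg E2 = b1 - a*lg M
    a = (n * sxy - sx * sy) / (sx * sx - n * sxx)
    # Global minimum of the multiplicative criterion FOM-a
    m_opt, _ = min(pairs, key=lambda p: p[1] * p[0] ** a)
    return m_opt, a

# Synthetic R-D curve: E2 = 1000 / M^2, except that M = 12 fits the
# (imaginary) shape twice as well as the model predicts.
ms = list(range(4, 31))
errors = [1000.0 / m ** 2 for m in ms]
errors[ms.index(12)] /= 2.0
m_opt, a = fom_a_optimum(ms, errors)
```

In this synthetic example the selected value is the point that lies furthest below the fitted model, which mirrors the observation in Section 3 that the most deviating point gives the sought input parameter. Step 3 would then run the approximation algorithm with the selected value as its input parameter.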
5. REFERENCES

[1] J.C. Perez and E. Vidal, “Optimum Polygonal Approximation of Digitized Curves”, Pattern Recognition Letters, 15, pp. 743-750, 1994.
[2] A. Kolesnikov and P. Fränti, “Data Reduction of Large Vector Graphics”, Pattern Recognition, 38, pp. 381-394, 2005.
[3] A. Kolesnikov, “Fast Algorithm for ISE-bounded Polygonal Approximation”, Proc. IEEE Int. Conf. Image Processing-ICIP'08, pp. 1013-1015, 2008.
[4] A. Kolesnikov, “Minimum Description Length Approximation of Digital Curves”, Proc. IEEE Int. Conf. Image Processing-ICIP'09, pp. 449-452, 2009.
[5] A. Gribov and E. Bodansky, “A New Method of Polyline Approximation”, Proc. Int. Conf. Structural, Syntactic and Pattern Recognition, LNCS, vol. 3138, pp. 504-511, 2004.
[6] D. Sarkar, “A Simple Algorithm for Detection of Significant Vertices for Polygonal Approximation of Chain-Coded Curves”, Pattern Recognition Letters, 14(12), pp. 959-964, 1993.
[7] P. Rosin, “Techniques for Assessing Polygonal Approximation of Curves”, Trans. Pattern Analysis and Machine Intell., 19(6), pp. 659-666, 1997.
[8] M. Marji and P. Siy, “Polygonal Representation of Digital Planar Curves through Dominant Point Detection – A Nonparametric Algorithm”, Pattern Recognition, 37, pp. 2113-2130, 2004.
[9] A. Held, K. Abe, and C. Arcelli, “Towards a Hierarchical Contour Description via Dominant Point Detection”, IEEE Trans. Sys. Man, and Cyber., 24, pp. 942-949, 1994.
[10] A. Carmona-Poyato, F.J. Madrid-Cuevas, R. Medina-Carnicer, and R. Muñoz-Salinas, “A New Measurement for Assessing Polygonal Approximation of Curves”, Pattern Recognition, 44(1), pp. 45-54, 2011.
[11] T.P. Nguyen and I. Debled-Rennesson, “Parameter-free Method for Polygonal Representation of Noisy Curves”, Proc. 13th Int. Workshop on Combinatorial Image Analysis-IWCIA'2009, 2009.