IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 42, NO. 6, JUNE 1994
1548
On the Statistical Optimality of Locally Monotonic Regression

Alfredo Restrepo (Palacios) and Alan C. Bovik
Abstract: Locally monotonic regression is a recently proposed technique for the deterministic smoothing of finite-length discrete signals under the smoothing criterion of local monotonicity. Locally monotonic regression falls within a general framework for the processing of signals that may be characterized in three ways: regressions are given by projections that are determined by semi-metrics, the processed signals meet shape constraints that are defined at the local level, and the projections are optimal statistical estimates in the maximum likelihood sense. Here, we explore the relationship between the geometric and deterministic concept of projection onto (generally nonconvex) sets and the statistical concept of likelihood, with the object of characterizing projections under the family of the p-semi-metrics as maximum likelihood estimates of signals contaminated with noise from a well-known family of exponential densities.
I. INTRODUCTION

We discuss here a statistical aspect of the concept of projection onto (generally nonconvex) sets of signals, or regression, as it was recently proposed [1], [2] for the processing of finite-length discrete signals. One may argue, particularly with optical images, that some signals carry their information explicitly as shape. A shape constraint is a property defined in the natural domain of the signal, e.g., time or space rather than frequency; we consider shape constraints that are defined at the local level rather than at the global level, e.g., local monotonicity versus (global) monotonicity. The projections of a signal on a set of signals are defined as the signals in the set that are closest to the signal being projected, under a semi-metric for the space.

Local monotonicity [3] is a shape constraint for one-dimensional signals that provides a measure of the smoothness of a signal; it sets a limit on the roughness of a signal by limiting how often the signal may have a change of trend (increasing to decreasing or vice versa). In a sense of the word, it limits the frequency of the oscillations that a signal may have without making restrictions on the magnitude of the changes of the signal from each coordinate to the following one. Piecewise constancy, piecewise linearity, and local convexity/concavity are other, similar shape constraints that have been explored [4]. Algorithms for the computation of the locally monotonic regression of a finite-length signal of length n have a complexity that is exponential in n [1]; however, much faster algorithms have been developed that employ regression only over a moving window (hence, computation is linear in the signal length) [5], and that compute a fuzzy approximation to the true locally monotonic regression using a generalized deterministic annealing algorithm [6].
Manuscript received May 15, 1992; revised September 16, 1993. The associate editor coordinating the review of this paper and approving it for publication was Prof. Gonzalo Arce. A. Restrepo (Palacios) was with the Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712 USA. He is now with the Departamento de Ingeniería Eléctrica, Universidad de los Andes, A. A. 4976, Bogotá, Colombia. A. C. Bovik is with the Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712 USA. IEEE Log Number 9400387.
II. SHAPE CONSTRAINTS

An integer interval /a, b/, where a and b are integer numbers, is defined as the subset of the integers that are larger than or equal to a and smaller than or equal to b. An n-point signal (or a discrete signal of length n) is a real function x having as domain a nonempty integer interval /a, b/, where b - a = n - 1. An n-point signal is a point of R^n and may be expressed as the n-tuple [x_1, ..., x_n] of the values it takes. The origin [0, ..., 0] of R^n is denoted as θ. The slope skeleton of an n-point signal x = [x_1, ..., x_n] is the (n - 1)-point signal s = [s_1, ..., s_{n-1}] having components s_i = sgn(x_{i+1} - x_i), where sgn is the signum function, which is, respectively, one, zero, and minus one when its argument is positive, zero, or negative. The segments of length r of a signal u : /1, n/ → R^1, r ≤ n, are the restrictions of u to integer intervals of length r. For example, [2, 3, 4] is a segment of length 3 of the signal [1, 2, 3, 4].

Constancy and linearity are well-known shape constraints. Less commonly used shape constraints are monotonicity, convexity, concavity, piecewise constancy, piecewise linearity, local monotonicity, and local convexity/concavity [4], [6]. A signal is constant if its slope skeleton s is null: s = θ. A signal is said to be nondecreasing if the components of its slope skeleton are nonnegative and nonincreasing if they are nonpositive. A signal is monotonic if it is either nonincreasing or nondecreasing. A signal is said to be convex if its slope skeleton is nondecreasing and concave if its slope skeleton is nonincreasing. These are global shape constraints. A signal is locally monotonic of degree α (lomo-α) if each of its segments of length α is monotonic. A signal is locally convex/concave of degree α (loco-α) if each of its segments of length α has a monotonic slope skeleton. These are shape constraints defined at the local level. A signal is linear if its slope skeleton is constant.
The algebraic span of the constant signal [1, 1, ..., 1] and the linear signal [1, 2, ..., n] is the collection of the linear signals of length n. A signal may be segmented into longest constant segments in a unique way. For example, the longest constant segments in [1, 3, 3, 2, 2, 2, 4, 5, 5] are [1], [3, 3], [2, 2, 2], [4], and [5, 5]. A signal is said to be piecewise constant of degree α (pico-α) if, besides the first and last segments, the shortest segments of its segmentation into constant segments have length at least α. Each coordinate point i of a signal s such that s_{i-1} - 2s_i + s_{i+1} is nonzero is a point where the slope of s changes and is called a hinge of s. In addition, the first and last coordinates of a signal are called hinges; accordingly, a linear signal (of length larger than one) has exactly two hinges. A signal is said to be piecewise linear of degree α (pili-α) if the difference between each two consecutive hinges is at least α; any signal is pili-1. An n-point signal may have as few as two hinges and as many as n hinges. For example, the signal in Fig. 1(a) is lomo-4, loco-7, pico-3, and pili-1. The signal in Fig. 1(b) is lomo-2, loco-5, pico-1, and pili-3. All of these shape constraints are defined locally and are novel measures of the smoothness of a signal; the larger the degree α, the smoother the signal.
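These definitions translate directly into code. The following Python sketch (the function names are ours, not from the paper) computes the slope skeleton of a signal and tests the lomo-α constraint:

```python
def slope_skeleton(x):
    """Signs of the successive differences of an n-point signal."""
    sgn = lambda v: (v > 0) - (v < 0)
    return [sgn(b - a) for a, b in zip(x, x[1:])]

def is_monotonic(x):
    """A signal is monotonic if it is nonincreasing or nondecreasing."""
    s = slope_skeleton(x)
    return all(v >= 0 for v in s) or all(v <= 0 for v in s)

def is_lomo(x, alpha):
    """Locally monotonic of degree alpha: every length-alpha segment is monotonic."""
    return all(is_monotonic(x[i:i + alpha]) for i in range(len(x) - alpha + 1))
```

For the example above, slope_skeleton([1, 3, 3, 2]) returns [1, 0, -1], and the signal [1, 3, 3, 2, 2, 2, 4, 5, 5] is lomo-3 but not lomo-4 (its first segment of length 4, [1, 3, 3, 2], is not monotonic).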
III. SEMI-METRICS AND LIKELIHOOD FUNCTIONS

A semi-metric differs from a metric in that it may lack the triangle inequality property. A semi-metric measures, in a coordinatewise way, the similarity of a pair of signals. A semi-metric for R^n is a function d : R^n × R^n → [0, ∞) that is positive definite and symmetric;
1053-587X/94$04.00 © 1994 IEEE
Fig. 1. Two discrete signals that satisfy different shape constraints.
that is,

1) ∀ x, y ∈ R^n: d(x, y) = 0 ⇔ x = y (positive definiteness);
2) ∀ x, y ∈ R^n: d(x, y) = d(y, x) (symmetry).
A large class of functions are semi-metrics for R^n; for measuring the similarity between signals, it is convenient to have a semi-metric that is translation-invariant:
3) ∀ x, y, z ∈ R^n: d(x + z, y + z) = d(x, y) (translation invariance).

To prove the existence of projections, it is also necessary that d be continuous (in the standard topologies for R^{2n} and R^1). It is also natural to require that the semi-metric be nondecreasing: for each signal y with nonnegative components and for each n-point signal z, d(θ, z) ≤ d(θ, z + y). Many semi-metrics are positive homogeneous as well:
4) ∀ x, y ∈ R^n, ∀ γ ∈ R^1: d(γx, γy) = |γ| d(x, y) (positive homogeneity).

A metric is a semi-metric that has the triangle-inequality property:

5) ∀ x, y, z ∈ R^n: d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
(triangle inequality). The distance between a point z and a set S is given by
D ( z ,S) = inf { d ( z , s) : s E S } . A . A Family of Semi-Metrics A well-known collection of positive homogeneous and translationinvariant semi-metrics on R" that is indexed by the parameter p E (0, CG) is defined as
dp(2.Y) =
Is, - ?AI (z:l
p)l'p
.
p E (0,m).
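A direct transcription of d_p in Python, together with a numerical illustration (the points are chosen by us) of why the triangle inequality fails when p < 1:

```python
def d_p(x, y, p):
    """p-semi-metric: (sum_i |x_i - y_i|^p)**(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# With p = 1/2 and the corner points of a unit square, the direct
# distance exceeds the two-leg path, so the triangle inequality fails.
x, y, z = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
assert d_p(x, z, 0.5) > d_p(x, y, 0.5) + d_p(y, z, 0.5)
```

Here d_p(x, z) = (1 + 1)^2 = 4 while the two legs contribute only 1 + 1 = 2; for p ≥ 1 no such violation occurs.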
For p ∈ [1, ∞), d_p is a metric; for p ∈ (0, 1), d_p is a semi-metric, but it is not a metric. This family includes the Euclidean metric d_2 and the city-block metric d_1. These p-semi-metrics are translation-invariant, nondecreasing, and continuous.

Assume that a shape constraint has been specified. Let Q be the truth-value function that is true on signals meeting the constraint and false otherwise, and let A = {s ∈ R^n : Q(s)} be the set of signals having the required shape. If A is a nonempty proper subset of R^n and x is a signal not in A, i.e., x ∈ (R^n - A), the set
P_A(x) = { a ∈ A : d(x, a) = D(x, A) }

gives the set of the projections of x on A (or of regressions of x with respect to A) under the semi-metric d. The existence and multiplicity of projections have been characterized in [1].
Fig. 2. (a) A signal and (b) its lomo regression.
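Since exact locally monotonic regression has exponential complexity, for very short signals the projection set P_A(x) can be found by brute force. The following sketch is entirely ours: it restricts the candidates to a quantized grid of levels rather than all of R^n, enumerates the lomo-α candidates, and keeps the nearest under d_p:

```python
from itertools import product

def d_p(x, y, p):
    """p-semi-metric between two signals of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def is_monotonic(x):
    s = [(b > a) - (b < a) for a, b in zip(x, x[1:])]
    return all(v >= 0 for v in s) or all(v <= 0 for v in s)

def is_lomo(x, alpha):
    return all(is_monotonic(x[i:i + alpha]) for i in range(len(x) - alpha + 1))

def lomo_projections_bruteforce(x, alpha, p, levels):
    """All nearest lomo-alpha grid signals; cost is exponential in len(x)."""
    best, best_d = [], float("inf")
    for a in product(levels, repeat=len(x)):
        if not is_lomo(a, alpha):
            continue
        d = d_p(x, a, p)
        if d < best_d - 1e-12:
            best, best_d = [a], d       # strictly closer: new projection set
        elif d < best_d + 1e-12:
            best.append(a)              # tie: projections need not be unique
    return best, best_d
```

For x = [0, 2, 0, 2] with α = 3, p = 2, and levels {0, 1, 2}, the unique projection on this grid is [0, 1, 1, 2], at distance √2.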
The distance between x and A is given by the smallest radius ρ for which the boundary of A ⊕ B(θ, ρ) contains x (⊕ stands for the operator of Minkowski set addition [7]). In Fig. 2, a signal and a corresponding locally monotonic regression are shown. Algorithms for the computation of linear regression are well known. Algorithms that compute locally monotonic projections under the p-semi-metrics d_p, p ∈ (0, ∞), are given in [1]. Algorithms for the computation of projections under the Euclidean metric for other shape constraints such as piecewise linearity and local convexity/concavity may be found in [4], [6].

B. A Family of Densities

We make use of a family of generalized exponential probability density functions that have been used extensively in robust statistics [8] and in the design of order statistic filters [9]-[11]. It is defined as follows:
f_p(v) = γ exp(-C |v|^p)

where γ and C are positive constants (that depend on p) that determine the variance of the random variable and ensure that each density f_p integrates to one. The Gaussian and Laplacian densities are in the family.
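For a density of the form f_p(v) = γ exp(-C |v|^p) to integrate to one, the constants must satisfy γ = p C^{1/p} / (2 Γ(1/p)); this relation is our own derivation (via the substitution u = C|v|^p), not stated in the paper, and the following sketch checks it numerically:

```python
import math

def gamma_const(p, C):
    """Normalizer gamma making f_p(v) = gamma * exp(-C * |v|**p) integrate to one."""
    return p * C ** (1.0 / p) / (2.0 * math.gamma(1.0 / p))

def f_p(v, p, C):
    """Generalized exponential density with exponent p and scale constant C."""
    return gamma_const(p, C) * math.exp(-C * abs(v) ** p)

# p = 2 with C = 1/2 recovers the standard Gaussian normalizer 1/sqrt(2*pi),
# and p = 1 with C = 1 recovers the Laplacian normalizer 1/2.
assert abs(gamma_const(2, 0.5) - 1.0 / math.sqrt(2.0 * math.pi)) < 1e-12
assert abs(gamma_const(1, 1.0) - 0.5) < 1e-12
```

Small p gives heavier tails, which is what makes the corresponding estimators robust to impulsive noise.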
C. Likelihood Functions

Assume that a system outputs (deterministic) signals of length n that are characterized by a shape constraint Q, and that any signal meeting the shape constraint may be an output of the system, with uniform probability on the set A = {s ∈ R^n : Q(s)}; signals not in A are outputs of the system with probability zero. Also, before the output signal t can be observed, it becomes contaminated with an additive random signal ζ. Given an observed noisy signal x = t + z, it is desired to obtain an estimate s of t. The likelihood L(x; a) of x being equal to a signal a in A plus a sample z from ζ depends both on a and x. Given the distribution of the components of ζ, the set
Fig. 3. An ML estimate s of the signal t, given the observed noisy signal x = t + z.
A, and an observation x, it is desired to find signals s in A that maximize L(x; a) as a ranges over A (see Fig. 3). For a white random signal ζ of i.i.d. components having a density f_p, the likelihood function L(x; a) is given by
L(x; a) = γ^n exp [ -C( |x_1 - a_1|^p + ... + |x_n - a_n|^p ) ] = γ^n exp [ -C {d_p(a, x)}^p ]
and a monotonic dependence of the likelihood on the distance between x and a appears explicitly. Thus, a maximum likelihood estimate is a signal s in A that is closest to x under the semi-metric d_p; that is, the maximum likelihood estimates are the p-projections of x on A. In [12], an M estimate is an estimate T that minimizes a function of the form Σ_i ρ(x_i; T), where ρ is an arbitrary function, and an ML estimate T is one that minimizes Σ_i -ln f(x_i; T). Of course, the equivalence between minimizing a loss function and maximizing a likelihood function is well known; here, we point out a particular relationship between maximum likelihood estimation and semi-metrics within the context of projection.
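The equivalence can be checked numerically: for f_p noise, ranking candidate signals by likelihood and ranking them by d_p always agree, because ln L(x; a) = n ln γ - C {d_p(x, a)}^p is strictly decreasing in d_p(x, a). A small sketch (the candidate set and observation are chosen by us):

```python
def d_p(x, y, p):
    """p-semi-metric between two signals of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def log_likelihood(x, a, p, C=1.0):
    """ln L(x; a) up to an additive constant, for i.i.d. f_p noise."""
    return -C * sum(abs(xi - ai) ** p for xi, ai in zip(x, a))

A = [(0, 0, 1), (0, 1, 2), (1, 1, 1), (2, 2, 2)]   # a finite candidate set
x = (0.2, 0.9, 1.8)                                 # observed noisy signal
for p in (0.5, 1.0, 2.0):
    nearest = min(A, key=lambda a: d_p(x, a, p))
    most_likely = max(A, key=lambda a: log_likelihood(x, a, p))
    assert nearest == most_likely                   # ML estimate = p-projection
```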
D. Semi-Metrics That Are Obtained from Likelihood Functions

As above, the likelihood function is in many cases an increasing function of a semi-metric; in fact, a large class of likelihood functions determine translation-invariant semi-metrics. For example, let g_1, g_2, ..., g_n be a collection of continuous probability density functions that are unimodal with the mode occurring at zero, strictly increasing on (-∞, 0), strictly decreasing on (0, ∞), and positive on the real line R^1. The exponential densities defined above, and many others, e.g., the Cauchy density, meet these requirements. Let g : R^n → R^1 be given by g(v) = g_1(v_1) g_2(v_2) ... g_n(v_n), where v = [v_1, v_2, ..., v_n]. Then, ln(g) is a continuous function that is maximal at θ, and the function δ : R^{2n} → R^1 given by

δ(x, y) = ln g(θ) - ln g(x - y)
is a continuous, nondecreasing semi-metric for R^n. If A is the set of signals meeting a given shape constraint Q, ζ = [ζ_1, ζ_2, ..., ζ_n] is a random signal of independent components where each component ζ_i has a density g_i, and x = a + z is a noisy version of a signal a in A, then, given an observed sample signal x, the likelihood that x results from the addition of a signal a and a sample z of ζ is L(x; a) = g(x - a). Under the hypothesis that a is in A, L(x; a) is maximized by a signal s that minimizes {δ(x, a) : a ∈ A}, that is, by any of the projections of x on A under the semi-metric δ.
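As a concrete case, take each g_i to be the standard Cauchy density, which the text notes meets the requirements, and take δ(x, y) = ln g(θ) - ln g(x - y) (one natural form consistent with the description above; an assumption on our part). The resulting function is positive definite, symmetric, and translation-invariant, yet can violate the triangle inequality:

```python
import math

def cauchy(v):
    """Standard Cauchy density: continuous, unimodal at zero, positive on R."""
    return 1.0 / (math.pi * (1.0 + v * v))

def delta(x, y):
    """delta(x, y) = ln g(0) - ln g(x - y), with g a product of Cauchy densities."""
    return sum(math.log(cauchy(0.0)) - math.log(cauchy(a - b))
               for a, b in zip(x, y))

x, y, z = (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)
assert delta(x, x) == 0.0 and delta(x, y) > 0.0      # positive definite
assert abs(delta(x, y) - delta(y, x)) < 1e-12        # symmetric
assert delta(x, z) > delta(x, y) + delta(y, z)       # triangle inequality fails
```

Each coordinate contributes ln(1 + v^2), so δ(x, z) = 2 ln 5 ≈ 3.22 exceeds δ(x, y) + δ(y, z) = 4 ln 2 ≈ 2.77: a semi-metric, not a metric.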
IV. CONCLUSION

The technique of regression is widely used in statistics, but it is rarely used with the purpose of shaping or filtering signals. Linear regression [13] and isotonic regression [14], [15] are perhaps the most commonly used types of regression in statistics. Conditions of monotonicity arise naturally in certain classes of problems; for example, consider the collection of attention times of the elements of a queue [16]. Projections on the set of monotonic signals are called monotonic regressions. Isotonic regression and related topics have been considered previously for image processing [17]. Here, we have examined a signal-smoothing paradigm, designed under geometric and deterministic concepts of projection, that is optimal for signal estimation in the maximum likelihood sense. The use of semi-metrics in the projection apparatus, although not common in signal processing, provides estimators of signals contaminated with highly impulsive noise. A family of robust estimators, each optimal for a given density, is obtained. The shape constraints considered here provide criteria of smoothness that emphasize different characteristics of signals. We believe that the use of shape constraints that are defined at the local level, together with the concept of projection, provides a powerful tool for the shaping, smoothing, and filtering of signals.

REFERENCES

[1] A. Restrepo and A. C. Bovik, "Locally monotonic regression," IEEE Trans. Signal Processing, vol. 41, pp. 2796-2810, Sept. 1993.
[2] A. Restrepo and A. C. Bovik, "Statistical optimality of locally monotonic regression," in Proc. SPIE/SPSE Conf. Nonlinear Image Processing, Santa Clara, CA, 1990.
[3] S. G. Tyan, "Median filtering: Deterministic properties," in Two-Dimensional Digital Signal Processing II: Transforms and Median Filters, T. S. Huang, Ed. Berlin: Springer-Verlag, 1981, pp. 197-217.
[4] A. Restrepo, "Nonlinear regression for signal processing," in Proc. SPIE/SPSE Conf. Nonlinear Image Processing II, San Jose, CA, 1991, pp. 89-99.
[5] A. Restrepo and A. C. Bovik, "Windowed locally monotonic regression," presented at the IEEE Int. Conf. Acoust., Speech, Signal Processing, Toronto, Canada, May 1991.
[6] S. T. Acton and A. C. Bovik, "Nonlinear regression for image enhancement via generalized deterministic annealing," presented at the SPIE Conf. Visual Commun. Image Processing, Boston, MA, Nov. 1993.
[7] J. Serra, Image Analysis and Mathematical Morphology. London: Academic, 1982.
[8] R. V. Hogg, "More light on the kurtosis and related statistics," J. Amer. Statist. Ass., vol. 61, pp. 422-424, 1972.
[9] A. C. Bovik, T. S. Huang, and D. C. Munson, "A generalization of median filtering using linear combinations of order statistics," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1342-1350, 1983.
[10] A. Restrepo and A. C. Bovik, "Adaptive trimmed mean filters for image restoration," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1326-1337, 1988.
[11] A. Restrepo, "On dynamic characteristics of L-filters and on adaptive L-filters," M.S. thesis, Univ. Texas, Austin, Aug. 1986.
[12] P. J. Huber, Robust Statistics. New York: Wiley, 1981.
[13] A. B. Forsythe, "Robust estimation of straight line regression coefficients by minimizing pth power deviations," Technometrics, vol. 14, pp. 159-166, 1972.
[14] T. Robertson, F. T. Wright, and R. L. Dykstra, Order Restricted Statistical Inference. Chichester: Wiley, 1988.
[15] R. E. Barlow, D. J. Bartholomew, J. M. Bremner, and H. D. Brunk, Statistical Inference Under Order Restrictions. New York: Wiley, 1972.
[16] D. J. Bartholomew, "A test of homogeneity for ordered alternatives," Biometrika, vol. 46, pp. 36-48, 1959.
[17] A. C. Bovik, T. S. Huang, and D. C. Munson, Jr., "Edge sensitive image restoration using order-constrained least-squares methods," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1253-1263, 1985.