
Predicting Wavelet Coefficients Over Edges Using Estimates Based on Nonlinear Approximants

Onur G. Guleryuz
Epson Palo Alto Laboratory
3145 Porter Drive, Palo Alto, CA 94304
[email protected]

January 9, 2004

Abstract

It is well-known that wavelet transforms provide sparse decompositions over many types of image regions, but not over image singularities/edges that manifest themselves along curves. It is now widely accepted that, on 2D piecewise smooth signals, wavelet compression performance is dominated by coefficients over edges. Research in this area has followed two tracks, each suffering from issues related to translation invariance: methods that directly model high order coefficient dependencies over edges have to combat aliasing, and newly designed transforms lose their full strength if they are not used in a translation invariant fashion. In this paper we combine these approaches and use translation invariant, overcomplete representations to predict wavelet edge coefficients. Starting with the lowest frequency band of an l level wavelet decomposition, we reliably estimate the missing higher frequency coefficients over piecewise smooth signals. Unlike existing techniques, our approach does not model edges directly but obtains boundaries implicitly, by aggressively determining the regions where the utilized translation invariant decomposition is sparse.

1 Introduction

One of the key problems in wavelet image compression is the compressibility of wavelet coefficients over edges. For one dimensional piecewise smooth signals it can be shown that wavelet representations, and hence compression applications based on wavelets, are immune to localized singularities [4]. For two dimensional piecewise smooth signals, however, it is now widely recognized that edges lead to a non-sparse set of wavelet coefficients, and compression performance is dominated by localized singularities which manifest themselves along curves [7]. Researchers have addressed this problem along two main tracks. First, by better modeling wavelet coefficients over edges, higher order dependencies can be exploited and the number of bits spent on such coefficients by compression codecs can be reduced (see, e.g., [1, 2, 6] and references therein). Second, by designing new representations and transforms, it may be possible to convert the 2D problem into the 1D case, where edges are reduced to point singularities and are encoded with a much reduced number of bits (see, e.g., [17, 10, 18] and references therein).

First track approaches operate on naturally decimated wavelet coefficients, but they have to combat aliasing concerns in designing their models. In a related fashion, some of the key properties of the best representations designed via the second track can only be exploited via translation/rotation invariant, overcomplete transforms [5, 14]. But the use of overcomplete transforms gives rise to a dilemma in compression, where one must first represent the input signal with an overcomplete expansion (which significantly increases the amount of information to encode) and then somehow obtain a compressed bitstream that competes with today's state-of-the-art codecs in a rate-distortion sense. In this paper, we combine the two tracks and use overcomplete representations in the wavelet domain to determine higher order dependencies for wavelet coefficients over singularities. Since we use overcomplete representations to construct adaptive predictors of wavelet coefficients, the techniques proposed in this paper are designed to apply the results of the second track research at full strength in providing solutions to the first track problem. Unlike existing literature that tries to model singularities directly, our approach forms effective, simple, and robust models for "non-singularities" through the use of sparse overcomplete decompositions. We assume that we are given an overcomplete set of localized linear transforms that are expected to provide sparse decompositions over the signal of interest, i.e., each one of the transforms in the overcomplete set yields many coefficients that have small magnitudes. By applying these transforms over the signal and hard-thresholding the resultant transform coefficients, we adaptively determine the set of insignificant coefficients for each transform as the indices of those coefficients that are thresholded to zero.
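As a concrete illustration of this thresholding step, the sketch below hard-thresholds the coefficients of a piecewise constant signal under a single orthonormal transform and collects the indices that fall below the threshold. All names are hypothetical, and a hand-built Haar matrix stands in for one localized transform of the overcomplete set; this is a minimal caricature, not the construction used later in the paper.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar transform matrix; rows are the basis vectors g_i^T."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                 # averaging (coarse) rows
    bot = np.kron(np.eye(n // 2), [1.0, -1.0])   # differencing (detail) rows
    m = np.vstack([top, bot])
    return m / np.linalg.norm(m, axis=1, keepdims=True)

G = haar_matrix(8)
x = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])  # piecewise constant signal
c = G @ x                          # transform coefficients c_i = g_i^T x
T = 0.5                            # hard threshold
V = np.flatnonzero(np.abs(c) < T)  # insignificant set: indices thresholded to zero
# Away from the step at index 4 the decomposition is sparse: only the DC
# coefficient and the single wavelet straddling the edge survive.
```

In this toy case six of the eight coefficients are insignificant; only the coefficients whose basis vectors overlap the edge remain significant, which is the behavior the sparsity constraints exploit.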
This set is used to establish sparsity constraints, which in turn are used to estimate the high order dependencies of wavelet coefficients. Each of the localized transforms in the overcomplete set has "sparse regions", where it produces the advertised sparse set of coefficients, and regions over singularities, where its sparsity properties fail. Our method adaptively determines and prefers the sparse regions of each transform in forming overall estimates. Interestingly, this aggressive determination of sparse regions brings about the accurate determination of the edges that form their boundaries. Our process can intuitively be thought of as producing estimates that utilize nonlinear approximants (given a set of transforms and thresholds) via the chain

observed signal → set of insignificant coefficients → sparsity constraints → nonlinear approximants of the signal → estimates. (1)

In a nutshell, we start from an observed signal that contains only the low frequency wavelet coefficients (only the lowest frequency band of a 2D l level wavelet transform), treat the remaining coefficients as "missing" data, and apply our techniques to predict the "missing" high frequency coefficients. This general prediction can be used in a variety of ways: for example, as part of a wavelet codec that uses the prediction to determine probability models for the next coefficient to be encoded or that performs DPCM style encoding, or as part of a wavelet decoder that carries out postprocessing given the decoded information. Hence our algorithms can easily be combined with today's coders without necessitating complete redesigns. We leave image coder integration

issues to another paper, and concentrate on prediction results with mean squared prediction error as our benchmark. To that end we note that, unlike other interpolation methods that operate in the wavelet domain (see, e.g., [3]), our approach does not require noise preconditioning or singularity detection/extrapolation. Similarly, unlike prevalent superresolution techniques, we require neither multiple images nor estimated disparity fields for prediction. Intuitively, our approach can be considered an adaptive and progressive version of the ideas in [15]. On a piecewise smooth signal the operation of this paper can be described as follows: given the available data, and given that we strongly believe that certain portions of the signal are smooth (as established through sparsity constraints), what is the best estimate of the missing data? The paper is organized as follows: §2 describes the main prediction framework used in this paper. After the overview of §2.1, we discuss sparsity constraints in §2.2, provide iterative solutions in §2.3, and show the need for progressive solutions in §2.4. In §2.5 we detail our main algorithm, and we extend it to overcomplete bases in §2.6. §3 includes simulation results and concluding remarks.
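The intuition above — honor the data we have, and impose smoothness where we strongly believe the signal is smooth — can be caricatured in one dimension by alternating two projections: hard-thresholding in an orthonormal transform (the sparsity constraint) and re-imposing the known samples. The sketch below, with hypothetical names and a hand-built Haar transform, recovers a missing sample of a piecewise constant signal this way; it is only an illustration of the flavor of the iterations, not the algorithm of §2.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar matrix; rows are the basis vectors."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    m = np.vstack([np.kron(h, [1.0, 1.0]),
                   np.kron(np.eye(n // 2), [1.0, -1.0])])
    return m / np.linalg.norm(m, axis=1, keepdims=True)

G = haar_matrix(8)
x_true = np.array([1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0])
known = np.ones(8, dtype=bool)
known[2] = False                  # treat sample 2 as missing

y = np.where(known, x_true, 0.0)  # initialize the missing sample with zero
T = 1.0
for _ in range(30):
    c = G @ y                     # analysis
    c[np.abs(c) < T] = 0.0        # sparsity projection (hard threshold)
    y = G.T @ c                   # synthesis
    y[known] = x_true[known]      # re-impose the available data
# y[2] converges to the value dictated by the smooth region surrounding it.
```

Each iteration pulls the missing sample toward the value consistent with the sparse (smooth) regions of the transform, while the known samples are never altered.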

2 Main Estimation Framework

From the point of view of our estimation framework, whether the missing data is in the pixel or wavelet domain is immaterial; hence in this section we describe our estimation process for a signal that is missing some of its samples. While we will be using overcomplete transforms, the main ideas and derivations are best illustrated with a single transform for clarity and ease of notation. We address generalizations to overcomplete transforms in §2.6, using the notation we establish in §2.1 through §2.5.

2.1 Overview

Let x (N × 1) be an N dimensional signal and assume we are given a linear, orthonormal transform G (N × N). Let g_i^T (1 × N), i = 1, ..., N, denote the transform basis (the rows of G), and let c_i = g_i^T x, i = 1, ..., N, denote the corresponding transform coefficients of x. We have

x = \sum_{i=1}^{N} c_i g_i. \qquad (2)

Define the insignificant set V(x) = \{ i : |c_i| < T \} for some threshold T. The cardinality of V(x) is \mathrm{card}(V(x)) = N - K. Our main assumption is that

x = \sum_{i \notin V(x)} c_i g_i + \sum_{i \in V(x)} c_i g_i \approx \sum_{i \notin V(x)} c_i g_i, \qquad (3)

i.e., nonlinear approximation with G using K = K(T) coefficients renders a close approximation to x. Observe that this is equivalent to assuming that |c_i| \approx 0, i \in V(x), since we are using orthonormal transforms. We further assume that K