Length-Limited Coding

Chapter 34

Lawrence L. Larmore* Daniel S. Hirschberg#

Abstract. An O(nL)-time algorithm is given for finding an optimal prefix-free binary code for a weighted alphabet of size n, with the restriction that no code string be longer than L. An O(nL log n)-time algorithm is given for the corresponding alphabetic problem, which is equivalent to optimizing a dictionary of n words, implemented as a binary tree of height h ≤ L with all data in the leaves.

1. Introduction

Huffman's Coding problem. Suppose C is an alphabet of size n, and w_i is the frequency with which the ith symbol of C is transmitted. A binary tree¹ with n leaves determines a prefix-free² binary code for C, and a tree T which minimizes the weighted depth Σ w_i l_i, where l_i is the depth of the leaf in T containing the ith symbol (of C), determines a code where the expected length of a code string is minimized. Figure 1 shows the correspondence between a binary tree and a binary code. Huffman's algorithm [Huf] finds such an optimal code in time O(n log n), and can be implemented to run in O(n) time if the w_i are already sorted [L].

The Alphabetic problem. In [HuTu], Hu and Tucker present an O(n log n) solution to the variation of Huffman's problem in which the desired tree must satisfy the alphabetic property, i.e., if the tree were to be traversed in symmetric order, the symbols (contained in the leaves) must be encountered in alphabetic order. Thus, the tree could be used as a binary search tree.

* Department of Mathematics and Computer Science, University of California, Riverside, CA 92521.
# Department of Information and Computer Science, University of California, Irvine, CA 92717.
¹ In this paper, each non-leaf node in a binary tree has exactly two children.
² A code is prefix-free if no code string is a prefix of any other. The advantage of a prefix-free code is that code strings can differ in length, yet any coded message can be decoded unambiguously.
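As an illustration of the weighted depth Σ w_i l_i from the Introduction, the following is a minimal Python sketch of Huffman's construction. The function names `huffman_code_lengths` and `weighted_depth` are hypothetical, and the sketch keeps explicit symbol lists for readability, so it does not reach the O(n log n) bound quoted above.

```python
import heapq


def huffman_code_lengths(weights):
    """Return the depth l_i of each symbol's leaf in a Huffman tree (sketch)."""
    n = len(weights)
    if n == 1:
        return [0]
    # Heap entries: (subtree weight, tie-breaker, symbols contained in the subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    depths = [0] * n
    counter = n
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Joining two subtrees pushes every leaf in them one level deeper.
        for i in s1 + s2:
            depths[i] += 1
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return depths


def weighted_depth(weights, depths):
    """Sum of w_i * l_i, the quantity an optimal code minimizes."""
    return sum(w * l for w, l in zip(weights, depths))


weights = [1, 1, 2, 3, 5]
depths = huffman_code_lengths(weights)
print(depths, weighted_depth(weights, depths))
```

Note that nothing in this sketch bounds the largest depth, which is exactly the restriction the length-limited problem adds.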

The Alphabetic problem can be solved in O(n log n) time by the Hu-Tucker [HuTu] and Garsia-Wachs [GaWa] algorithms. The non-alphabetic problem can be shown to be reduced to the alphabetic problem by simply sorting the weights [HuTa].

Height-limited optimal trees. A variation on the above two problems restricts trees to have height at most L, where L is a given constant. These two problems are sometimes referred to as the Length-Limited Coding and the Length-Limited Alphabetic Coding problems.

Previous results. The Length-Limited Coding problem is solved in O(nL2^L) time by Hu and Tan [HuTa], in O(n^2 L) time by Garey [Ga], and in O(n^1.5 L log^0.5 n) time by Larmore [L]. The restricted-length version of the alphabetic problem is solved in O(n^3 L) time by Garey [Ga], and in O(n^2 L) time by Itai and Wessler, independently [I] [W].

In this paper we present a simple O(nL)-time algorithm, which we call the Package-Merge algorithm, for the (non-alphabetic) Length-Limited Coding problem, and an O(nL log n)-time algorithm for the Length-Limited Alphabetic Coding problem.

2. The Package-Merge algorithm

In this section, we introduce the Coin Collector's problem, which is a version of the Knapsack problem, and the Package-Merge algorithm, which solves it in linear time. We then show how an instance of the Length-Limited Coding problem with parameters n and L can be reduced to an instance of the Coin Collector's problem of size nL. The Package-Merge algorithm thus solves the Length-Limited Coding problem in O(nL) time.

The Coin Collector's problem. A coin collector has m coins of various denominations (face values) and various numismatic values. The country he lives in has binary coinage, and so the denomination of each coin is an integral power of 2. The collector (rather unimaginatively) wishes to spend Q dollars (Q is an integer) to buy groceries, but the grocer refuses to accept any coin at other than its face value. How can the coin collector choose a set of coins of minimum total numismatic value whose total face value is Q?

An instance (I, Q) of the Coin Collector's problem of size m is formally defined by:
(a) A set I of m items, each of which has a width 2^-d (d ∈ N) and a non-negative weight. (Think of width as being face value of a coin, and weight as being numismatic value.)
(b) An integer Q.
A solution to such an instance is a subset S of I of minimal weight whose widths sum to exactly Q.

The general Knapsack problem is NP-complete. However, by adding the restriction that the widths are of the form 2^-d, the resulting Coin Collector's problem can be solved efficiently.
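To make the formal definition of an instance (I, Q) concrete, here is a small exhaustive-search sketch in Python. It is exponential in m and is meant only as a reference for the definition, not as the method of the paper. The representation of an item as a pair `(d, weight)` with width 2^-d, and the name `coin_collector_brute_force`, are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations


def coin_collector_brute_force(items, Q):
    """Try every subset S of I and keep a lightest one whose widths sum to Q.

    items: list of (d, weight) pairs; the item has width 2**-d (assumed encoding).
    Returns (total weight, tuple of chosen indices), or None if no subset fits.
    """
    best = None
    indices = range(len(items))
    for r in range(len(items) + 1):
        for subset in combinations(indices, r):
            # Exact arithmetic: widths are dyadic rationals.
            width = sum(Fraction(1, 2 ** items[i][0]) for i in subset)
            if width == Q:
                weight = sum(items[i][1] for i in subset)
                if best is None or weight < best[0]:
                    best = (weight, subset)
    return best


# Two half-dollar coins, two quarter-dollar coins, and one dollar coin.
items = [(1, 4), (1, 1), (2, 2), (2, 3), (0, 10)]
print(coin_collector_brute_force(items, 1))
```

In this instance the two half-width items together weigh less than the single width-1 item, so they form the solution for Q = 1.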

The Package-Merge algorithm. The Package-Merge algorithm maintains lists of "packages." Each package is a set of items whose total width is 2^-d for some d ∈ N, and each list consists of packages of all the same width, sorted in order of increasing weight. Initially, each item is a package by itself, and each list is the set of all items of a given width, sorted by weight. Each step of the algorithm combines the two smallest (least-weight) items of the smallest width to form a single package of the next larger width, which is then inserted into the appropriate list. An odd package remaining, of width less than 1, is discarded. Finally, there is only one list, consisting of packages of width 1, sorted by weight. S is then taken to be the union of the first Q of these. Figure 2 illustrates the Package-Merge algorithm.

PACKAGE-MERGE ALGORITHM

Let D be such that 2^-D is the smallest width of any item
A_d is the list of items of width 2^-d sorted by weight
for d ← D downto 1 loop
    if A_d has odd length then discard its heaviest item
    Combine adjacent pairs of A_d (each element has width 2^-d) to form a list B_d of packages of width 2^-(d-1)
    Merge B_d into A_(d-1)
endloop
Let S be the union of the Q least weight items of A_0
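The pseudocode above can be sketched in Python as follows. This is an illustrative transcription under stated assumptions: items are given as `(d, weight)` pairs with width 2^-d, Q is a non-negative integer, and the name `package_merge` is hypothetical. Each package carries a flat tuple of its member indices for readability, which costs more than the linear bounds discussed in the text.

```python
from heapq import merge


def package_merge(items, Q):
    """Sketch of Package-Merge for the Coin Collector's problem.

    items: list of (d, weight) pairs; the item has width 2**-d, d >= 0 (assumed encoding).
    Q: non-negative integer total width to reach.
    Returns (total weight, sorted indices of the chosen items), or None if
    fewer than Q packages of width 1 can be formed.
    """
    D = max((d for d, _ in items), default=0)
    # A[d] holds packages of width 2**-d as (weight, member indices), sorted by weight.
    A = [[] for _ in range(D + 1)]
    for idx, (d, w) in enumerate(items):
        A[d].append((w, (idx,)))
    for lst in A:
        lst.sort()
    for d in range(D, 0, -1):
        cur = A[d]
        if len(cur) % 2 == 1:
            cur = cur[:-1]  # odd length: discard the heaviest package
        # Package step: combine adjacent pairs into packages of width 2**-(d-1).
        B = [(cur[k][0] + cur[k + 1][0], cur[k][1] + cur[k + 1][1])
             for k in range(0, len(cur), 2)]
        # Merge step: both lists are already sorted by weight.
        A[d - 1] = list(merge(A[d - 1], B))
    if len(A[0]) < Q:
        return None
    chosen = A[0][:Q]  # the Q least-weight packages of width 1
    total = sum(w for w, _ in chosen)
    members = sorted(i for _, pkg in chosen for i in pkg)
    return total, members


items = [(1, 4), (1, 1), (2, 2), (2, 3), (0, 10)]
print(package_merge(items, 1))
print(package_merge(items, 2))
```

On this small instance, the two quarter-width items are packaged first, then discarded as the odd (heaviest) package at width 1/2, and the two half-width items form the lightest package of width 1.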

Correctness. We prove that the Package-Merge algorithm is correct by induction. If all items have width 1, correctness is trivial. Otherwise, since Q is an integer, S must contain an even number of items of the smallest width. If there is only one such item, discarding it will not affect the solution. If there are two or more items of smallest width, consider the two of smallest weight. Either both or neither of these will be in S, so combining them into a single item of larger width will not affect the solution. In either case the instance is reduced to an instance with fewer items.

Time analysis. Packaging the pairs of elements of a list takes time which is linear in the length of the list, while merging two sorted lists takes time which is linear in the sum of the lengths of the lists. We begin an amortization argument by placing three credits on each original item. Invariably, there are three credits on each item of any list which consists solely of original items, two credits on each item of any list formed as a result of a merge, and three credits on each item of list B_d. Each packaging step combines two items (from A_d) which have two or three credits each into one item (placed on B_d) which has three credits, allowing at least one credit to pay for the packaging operation. The merge step takes time linear in the sum of the lengths of the lists; one credit from each item (they have three each) pays for the merge, leaving two credits. Therefore, the Package-Merge algorithm on m items takes linear time, assuming that the items in the lists A_d are presorted by width, then by weight within each width (this will be the case in our application). Otherwise, some sorting must be applied first and the algorithm will require O(m log m) time.

Space analysis. The space requirement of the algorithm is O(m). Each package can be represented as a binary tree, where the leaves are original items.

We note that the algorithm can be modified to cover the case where Q is not an integer. Q must be some dyadic rational, otherwise no solution is possible. Write Q as an integer, ⌊Q⌋, plus a sum of distinct powers of 2 (for example, if Q = 3.625, write 3 + 2^-1 + 2^-3). During the iteration of the algorithm indexed by d, if that power of 2 occurs in the sum, then pluck the smallest item of width 2^-d from A_d before executing the package step. These plucked items will be included in S, as will the smallest ⌊Q⌋ packages of width 1.
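The decomposition of a non-integer Q into an integer plus distinct negative powers of 2 can be sketched in Python as below. The function name `dyadic_decomposition` and its return shape are assumptions made for this example; it only performs the arithmetic step, not the plucking itself.

```python
from fractions import Fraction


def dyadic_decomposition(Q):
    """Write Q as floor(Q) + sum of 2**-d over the returned exponents d.

    Raises ValueError when Q is negative or not a dyadic rational,
    in which case no solution is possible.
    """
    Q = Fraction(Q)
    den = Q.denominator
    # A dyadic rational in lowest terms has a power-of-2 denominator.
    if Q < 0 or den & (den - 1):
        raise ValueError("Q must be a non-negative dyadic rational")
    whole = Q.numerator // den
    frac = Q - whole
    exponents = []
    d = 0
    while frac:
        d += 1
        term = Fraction(1, 2 ** d)
        if frac >= term:
            exponents.append(d)  # an item of width 2**-d would be plucked at iteration d
            frac -= term
    return whole, exponents


print(dyadic_decomposition(Fraction(29, 8)))  # 29/8 = 3.625
```

For Q = 3.625 this yields the integer part 3 and the exponents 1 and 3, matching the example 3 + 2^-1 + 2^-3 in the text.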

The reduction (Coding problem → Coin Collector's problem). Let w_1, ..., w_n be the list of frequencies (sorted into non-increasing order) for an instance of the (non-alphabetic) Length-Limited Coding problem, and let L be the maximum permitted length. We define a node to be an ordered pair (i, l) with index i, weight w_i, level l, and width 2^-l.
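As a rough illustration of the node set, the sketch below builds one node per pair (i, l). The ranges 1 ≤ i ≤ n and 1 ≤ l ≤ L are an assumption, chosen so that the instance has the size nL stated earlier; the name `build_nodes` and the tuple layout are likewise hypothetical.

```python
from fractions import Fraction


def build_nodes(frequencies, L):
    """Return nodes as (i, l, weight, width) tuples.

    Assumes indices i = 1..n and levels l = 1..L, giving n * L nodes in total.
    """
    w = sorted(frequencies, reverse=True)  # non-increasing order, as in the text
    nodes = []
    for l in range(1, L + 1):
        for i in range(1, len(w) + 1):
            # Node (i, l) has weight w_i and width 2**-l.
            nodes.append((i, l, w[i - 1], Fraction(1, 2 ** l)))
    return nodes


nodes = build_nodes([3, 10, 1], L=2)
print(len(nodes), nodes[0], nodes[-1])
```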

the weights {(i,l)

(or widths)

1 l