Chapter 34
Length-Limited Coding
Lawrence L. Larmore* Daniel S. Hirschberg#
Abstract. An O(nL)-time algorithm is given for finding an optimal prefix-free binary code for a weighted alphabet of size n, with the restriction that no code string be longer than L. An O(nL log n)-time algorithm is given for the corresponding alphabetic problem, which is equivalent to optimizing a dictionary of n words, implemented as a binary tree of height h ≤ L with all data in the leaves.
1. Introduction

Huffman's problem. Suppose C is an alphabet of size n, and φ_i is the frequency with which the i-th symbol of C is transmitted. A binary tree¹ with n leaves determines a prefix-free² binary code for C, and a tree T which minimizes the weighted depth Σ φ_i l_i (where l_i is the depth of the leaf in T containing the i-th symbol of C) determines a code where the expected length of a code string is minimized. Figure 1 shows the correspondence between a binary tree and a binary code. Huffman's algorithm [Huf] finds such an optimal code in time O(n log n), and can be implemented to run in O(n) time if the φ_i are already sorted [L].

The Alphabetic Coding problem. In [HuTu], Hu and Tucker present an O(n log n) solution to the variation of Huffman's problem in which the tree must satisfy the alphabetic property, i.e., if the tree were to be traversed in symmetric order, the symbols (contained in the leaves) must be encountered in alphabetic order. Thus, the desired tree could be used as a binary search tree.

* Department of Mathematics and Computer Science, University of California, Riverside, CA 92521.
# Department of Information and Computer Science, University of California, Irvine, CA 92717.
¹ In this paper, each non-leaf node in a binary tree has exactly two children.
² A code is prefix-free if no code string is a prefix of any other. The advantage of a prefix-free code is that code strings can differ in length, yet any coded message can be decoded unambiguously.
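To make Huffman's problem from the Introduction concrete, here is a minimal Python sketch of the classical (unrestricted) Huffman construction. It only computes the code lengths l_i and the weighted depth Σ φ_i l_i. The function names `huffman_code_lengths` and `weighted_depth` are hypothetical, and the sketch imposes no length limit L.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return the depth l_i of each symbol in an optimal (unrestricted) Huffman tree.

    Illustrative sketch: repeatedly merges the two lightest subtrees and adds
    one to the depth of every symbol inside them.
    """
    n = len(freqs)
    lengths = [0] * n
    tie = count()  # tie-breaker so the heap never has to compare the symbol lists
    heap = [(f, next(tie), [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every symbol below the new internal node gets deeper
        heapq.heappush(heap, (f1 + f2, next(tie), s1 + s2))
    return lengths


def weighted_depth(freqs, lengths):
    """Sum of phi_i * l_i, the quantity Huffman's problem minimizes."""
    return sum(f * l for f, l in zip(freqs, lengths))
```

Keeping an explicit symbol list per subtree is simple but not the fastest bookkeeping; it is chosen here only so the depths can be read off directly.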
Height-limited optimal trees. A variation on the above two problems restricts the trees to have height at most L, where L is a given constant. These two problems are sometimes referred to as the Length-Limited (non-alphabetic) Coding and the Length-Limited Alphabetic Coding problems.

Previous results. The Alphabetic Coding problem can be solved in O(n log n) time by the Hu-Tucker algorithm [HuTu] and the Garsia-Wachs algorithm [GaWa]. The restricted-length version is solved in O(n^3 L) time by Garey [Ga], and in O(n^2 L) time by Itai and Wessner, independently [I] [W]. In this paper we present an O(nL log n)-time algorithm.

The non-alphabetic problem can be shown to be reduced to the alphabetic problem by simply sorting the weights. The Length-Limited Coding problem is solved in O(nL 2^L) time by Hu and Tan [HuTa], in O(n^2 L) time by Garey [Ga], and in O(n^{1.5} L log^{0.5} n) time by Larmore [L]. This paper contains a simple O(nL)-time algorithm, which we call the Package-Merge algorithm.

2. The Package-Merge algorithm

In this section, we introduce the Coin Collector's problem, which is a version of the Knapsack problem, and the Package-Merge algorithm which solves it in linear time. We then show how an instance of the Length-Limited Coding problem with parameters n and L can be reduced to an instance of the Coin Collector's problem of size nL. The Package-Merge algorithm thus solves the Length-Limited Coding problem in O(nL) time.

The Coin Collector's problem.
A coin collector has m coins of various denominations (face values) and various numismatic values. The country he lives in has binary coinage, and so the denomination of each coin is an integral power of 2. The coin collector wishes to spend Q dollars (Q is an integer) to buy groceries, but the grocer (rather unimaginatively) refuses to accept any coin at other than its face value. How can the coin collector choose a set of coins of minimum total numismatic value whose total face value is Q?

An instance (I, Q) of the Coin Collector's problem of size m is formally defined by:
(a) A set I of m items, each of which has a width 2^{-d} (d ∈ N) and a non-negative weight. (Think of width as being face value of a coin, and weight as being numismatic value.)
(b) An integer Q.
A solution to such an instance is a subset S of I of minimal weight whose widths sum to exactly Q.

The general Knapsack problem is NP-complete. However, by adding the restriction that the widths are of the form 2^{-d}, the resulting Coin Collector's problem can be solved efficiently.

The Package-Merge algorithm.
The Package-Merge algorithm maintains lists of "packages." Each package is a set of items whose total width is 2^{-d} for some d ∈ N, and each list consists of packages of all the same width, sorted in order of increasing weight. Initially, each item is a package by itself, and each list is the set of all items of a given width, sorted by weight. Each step of the algorithm combines the two smallest weight remaining items of the smallest width to form a single package of the next larger width, which is then inserted into the appropriate list. An odd package of width less than 1 is discarded. Finally, there is only one list, consisting of packages of width 1, sorted by weight. S is then taken to be the union of the first Q of these. Figure 2 illustrates the algorithm.

PACKAGE-MERGE ALGORITHM
Let D be such that 2^{-D} is the smallest width of any item
A_d is the list of items of width 2^{-d}, sorted by weight
for d ← D downto 1 loop
    if A_d has odd length then discard its heaviest item
    Combine adjacent pairs of A_d (each element has width 2^{-d}) to form a list B_d of packages of width 2^{-d+1}
    Merge B_d into A_{d-1}
endloop
Let S be the union of the Q least weight items of A_0
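The pseudocode above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each item is given as a pair (d, weight), meaning width 2^{-d} with d a non-negative integer, and that Q is a non-negative integer. The name `package_merge` and the choice to return item indices are hypothetical.

```python
from heapq import merge


def package_merge(items, Q):
    """Coin Collector sketch: items is a list of (d, weight), width = 2**-d.

    Returns the sorted indices of a minimum-weight subset whose widths sum
    to exactly Q, or None if no such subset exists.
    """
    if not items:
        return [] if Q == 0 else None
    D = max(d for d, _ in items)  # 2**-D is the smallest width present
    # A[d] holds packages of width 2**-d as (weight, tuple of item indices).
    A = [[] for _ in range(D + 1)]
    for idx, (d, weight) in enumerate(items):
        A[d].append((weight, (idx,)))
    for lst in A:
        lst.sort(key=lambda p: p[0])
    for d in range(D, 0, -1):
        cur = A[d]
        if len(cur) % 2 == 1:
            cur = cur[:-1]  # odd length: discard the heaviest package
        # Package adjacent pairs; the resulting list B_d stays sorted by weight.
        B = [
            (cur[k][0] + cur[k + 1][0], cur[k][1] + cur[k + 1][1])
            for k in range(0, len(cur), 2)
        ]
        # Merge B_d into A_{d-1}, both already sorted by weight.
        A[d - 1] = list(merge(A[d - 1], B, key=lambda p: p[0]))
    if Q > len(A[0]):
        return None  # fewer than Q packages of width 1 can be formed
    return sorted(i for _, members in A[0][:Q] for i in members)
```

Storing each package as a tuple of indices keeps the sketch short, but copying those tuples costs more than linear time. A linear-time version would represent each package as a binary tree whose leaves are the original items.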
Correctness. We prove that the Package-Merge algorithm is correct by induction. If all items have width 1, correctness is trivial. Otherwise, since Q is an integer, S must contain an even number of items of the smallest width. If there is only one such item, discarding it will not affect the solution. If there are two or more items of smallest width, consider the two of smallest weight. Either both or neither will be in S, so combining them into a single item of larger width will not affect the solution. In either case the instance is reduced to an instance with fewer items.

Time analysis. Packaging the pairs of elements of a list takes time which is linear in the length of the list, while merging two sorted lists takes time which is linear in the sum of the lengths of the lists. We begin an amortization argument by placing three credits on each original item. Invariably, there are three credits on each item of any list which consists solely of original items, two credits on each item of any list formed as a result of a merge, and three credits on each item of list B_d. Each packaging step combines two items (from A_d) which have two or three credits each into one item (placed on B_d) which has three credits, allowing at least one credit to pay for the operation. The merge step takes time which is linear in the sum of the lengths of the lists. One credit from each item (they have three each) pays for the merge, leaving two credits. Therefore, the algorithm on m items takes linear time assuming that the items in the lists A_d are presorted by width, then by weight within each width (this will be the case in our application). Otherwise, some sorting algorithm must be applied first and the algorithm will require O(m log m) time.

Space analysis. The space requirement is O(m). Each package can be represented as a binary tree, where the leaves are original items.

We note that the Package-Merge algorithm can be modified to cover the case where Q is not an integer. Q must be some dyadic rational, otherwise no solution is possible. Write Q as an integer, ⌊Q⌋, plus a sum of distinct powers of 2 (for example, if Q = 3.625, write 3 + 2^{-1} + 2^{-3}). During the iteration of the algorithm indexed by d, if that power of 2 occurs in the sum then pluck the smallest package of width 2^{-d} from A_d before executing the packaging step. These plucked packages will be included in S, as will the smallest ⌊Q⌋ packages of width 1.

The reduction (Coding problem → Coin Collector's problem). Let φ_1, ..., φ_n be the list of frequencies (sorted into non-increasing order) for an instance of the (non-alphabetic) Length-Limited Coding problem, and let L be the maximum permitted length.
We define a node to be an ordered pair (i, l), with index i, weight φ_i, level l, and width 2^{-l}.
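As an illustration of this definition, the following Python sketch enumerates the nodes for a given frequency list. It assumes the index i ranges over 1..n and the level l over 1..L; these ranges are an assumption, chosen to match the stated instance size nL. The name `build_nodes` is hypothetical.

```python
from fractions import Fraction


def build_nodes(freqs, L):
    """Return (index, level, weight, width) for every node (i, l).

    Assumed ranges: 1 <= i <= n and 1 <= l <= L, giving n * L nodes.
    Widths are exact fractions 2**-l, so sums of widths stay exact.
    """
    nodes = []
    for level in range(1, L + 1):
        for index, phi in enumerate(freqs, start=1):
            nodes.append((index, level, phi, Fraction(1, 2 ** level)))
    return nodes
```

Each node can then be treated as one item of a Coin Collector's instance, with the node's width as the coin's face value and its weight as the numismatic value.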
the weights {(i,l)
(or widths)
1 l