IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-24, NO. 4, JULY 1978

Coding Theorems for Individual Sequences

JACOB ZIV, FELLOW, IEEE

Abstract—A quantity called the finite-state complexity is assigned to every infinite sequence of elements drawn from a finite set. This quantity characterizes the largest compression ratio that can be achieved in accurate transmission of the sequence by any finite-state encoder (and decoder). Coding theorems and converses are derived for an individual sequence without any probabilistic characterization, and universal data compression algorithms are introduced that are asymptotically optimal for all sequences over a given alphabet. The finite-state complexity of a sequence plays a role similar to that of entropy in classical information theory (which deals with probabilistic ensembles of sequences rather than an individual sequence). For a probabilistic source, the expectation of the finite-state complexity of its sequences is equal to the source's entropy. The finite-state complexity is of particular interest when the source statistics are unspecified.
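The universal compression algorithm the abstract refers to is based on incremental parsing, as developed in [1] and [2]: the sequence is split into the shortest phrases not seen before, and the phrase count bounds the compressibility achievable by any finite-state encoder. The following minimal Python sketch (function names are illustrative, not from the paper) computes the number of phrases in such a parsing and a rough per-symbol code-length estimate of roughly c(n) log2 c(n) / n bits:

```python
import math

def phrase_count(s):
    """Incremental parsing: split s into phrases, each consisting of
    a previously seen phrase extended by one new symbol."""
    seen = set()
    phrase = ""
    count = 0
    for ch in s:
        phrase += ch
        if phrase not in seen:      # shortest new phrase found
            seen.add(phrase)
            count += 1
            phrase = ""
    # a non-empty tail that matched an earlier phrase still costs one codeword
    return count + (1 if phrase else 0)

def compression_estimate(s):
    """Rough per-symbol code length c(n)*log2(c(n))/n, which for long
    sequences approaches the finite-state complexity."""
    c = phrase_count(s)
    return c * math.log2(c) / len(s) if c > 1 else 0.0
```

For example, `phrase_count("aabbaabb")` parses the string as a|ab|b|aa|bb, giving 5 phrases; highly repetitive sequences yield far fewer phrases per symbol than random ones, which is what makes the phrase count a useful complexity measure.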


ACKNOWLEDGMENT

This work is an outgrowth of [1] and [2], and the author's debt to A. Lempel, the coauthor of [1] and [2], is obvious. The author wishes to acknowledge with thanks helpful discussions with H. S. Witsenhausen, L. A. Shepp, D. Slepian, S. P. Lloyd, and A. D. Wyner. In particular, A. D. Wyner contributed significantly to the proof of Theorem 5.

REFERENCES

[1] A. Lempel and J. Ziv, "On the complexity of an individual sequence," IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976.

[2] J. Ziv and A. Lempel, "A universal algorithm for sequential data compression," IEEE Trans. Inform. Theory, vol. IT-23, pp. 337-343, May 1977.

[3] R. M. Gray and L. D. Davisson, "Ergodic decomposition of stationary discrete random processes," IEEE Trans. Inform. Theory, vol. IT-20, pp. 625-636, Sept. 1974.

[4] P. Martin-Löf, "The definition of random sequences," Inform. and Contr., vol. 9, pp. 602-619, 1966.

[5] G. Chaitin, "A theory of program size formally identical to information theory," J. ACM, vol. 22, pp. 329-340, July 1975.

[6] J. Ziv, "Coding of sources with unknown statistics—Part I: Probability of encoding error," IEEE Trans. Inform. Theory, vol. IT-18, pp. 384-394, May 1972.

[7] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.

[8] P. Elias, "Universal codeword sets and representations of integers," IEEE Trans. Inform. Theory, vol. IT-21, pp. 194-203, Mar. 1975.

[9] M. Rodeh, "String matching algorithms and their application to data compression," Ph.D. dissertation (supervised by S. Even), Dep. Computer Science, Technion, Haifa, Israel, 1976.

[10] I. Shperling, "On the asymptotic complexity of sequences," M.Sc. thesis, Dep. Electrical Engineering, Technion, Haifa, Israel, 1976.

[11] R. M. Gray, "Sliding-block source coding," IEEE Trans. Inform. Theory, vol. IT-21, no. 4, pp. 357-368, July 1975.

[12] K. Winkelbauer, "On discrete information sources," in Trans. Third Prague Conf. on Inform. Theory, Decision Functions, and Random Processes, 1962. Prague: Czechoslovak Academy of Sciences, 1964, pp. 765-830.

[13] R. M. Gray, D. L. Neuhoff, and J. K. Omura, "Process definitions of distortion-rate functions and source coding theorems," IEEE Trans. Inform. Theory, vol. IT-21, no. 5, pp. 524-532, Sept. 1975.