Maximum Entropy and the Method of Moments in Performance Evaluation of Digital Communications Systems

M. Kavehrad (AT&T Bell Laboratories, Holmdel, NJ) and M. Joseph (Department of Computer Science, City College of New York, New York, NY)

Abstract—The maximum entropy criterion for estimating an unknown probability density function from its moments is applied to the evaluation of the average error probability in digital communications. Accurate averages are obtained even when only a few moments are available. The method is stable, and its results compare well with those of the powerful and widely used Gauss quadrature rules (GQR) method. For the test cases presented in this work, the maximum entropy method achieved accurate results with typically a few moments, while the GQR method required many more moments to reach the same accuracy. The method requires about the same number of moments as techniques based on orthogonal expansions. In addition, it provides an estimate of the probability density function of the target variable in a digital communication application.

I. INTRODUCTION

ENTROPY maximization historically has roots in the work of C. E. Shannon [1]. For discrete systems, maximizing the source entropy results in the best source encoding, in the sense of enabling the highest information rate over a fixed-capacity channel. Later, E. T. Jaynes [2] applied entropy maximization to some problems in statistical mechanics. Since then, there have been many conceptual ramifications of the rationale of maximum entropy [3]-[6]. For example, it is noted in [5] that most of the well-known distributions in statistics are maximum entropy distributions, given appropriate simple moment constraints: the maximum entropy density under the constraints of zero mean and second moment $\sigma^2$ is a normal density with zero mean and variance $\sigma^2$. In other words, among the many zero-mean densities with second moment equal to $\sigma^2$, the normal density maximizes the entropy function and is the most unbiased one.

In this work we extend the application of maximum entropy to the evaluation of average error probability in digital communication. In this application, the sampled received signal has to be detected in the presence of Gaussian noise and interference. In general, one can evaluate a conditional error probability that is conditioned on the interference sample. The interference, however, may have an unknown probability density $f(x)$ with a finite number of known moments. Examples found in practice are intersymbol interference (ISI), cross-coupling interference (CCI) or cross-polarization interference (CPI), multiple-access interference, etc. Although the interference is a function of random variables with known probability densities, the evaluation of its probability density is not always practical. For example, linear ISI comprises a sum of weighted random variables with known densities, where the weights are samples of a channel impulse response; the density of the interference itself, however, is most often not available in closed form. Therefore, subject to the available moments, one can estimate the unknown density $f(x)$ by maximizing the entropy function with respect to $f(x)$. We accept the approximate estimated density heuristically, since we have not proved it to be the actual one. In the applications described, once the probability density of the interference becomes available, one can average the conditional error probability over the interference. Other applications of maximum entropy, treated elsewhere [7], [8], are moment problems that arise in physics and spectral estimation. Among these moment problems, estimation of the multimodal probability density of states for a harmonic crystal and maximum entropy prediction for the isotropic Heisenberg model have been worked out in [7].

For the application in this paper, it is perhaps appropriate at this point to outline briefly some other moment methods for comparison purposes. One possibility is to expand $f(x)$ in some set of orthogonal polynomials. The resulting series is truncated after $N + 1$ terms, and the coefficients (weights) in the expansion are determined from the first $N + 1$ moments of the unknown probability density function. This requires the solution of a system of $N + 1$ linear equations. A proper choice of weighted orthogonal polynomials leads to fast convergence as $N$ grows; an improper choice, however, can lead to oscillating approximations to $f(x)$, with further inaccuracy from the lack of positivity at each stage of the iterations. A popular choice of orthogonal polynomials is the Hermite family [9], [10]. It should be noted that Murphy [11] and Nakhla [12] use Legendre and Chebyshev polynomials, respectively; Nakhla's [12] result is related to Murphy's by way of an approximation in evaluating the coefficients of the series expansion. For a large number of problems, both results exhibit good convergence properties with respect to the number of moments required.
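To make the expansion idea concrete: when the basis is orthogonal on the interval of interest, the $N + 1$ linear equations decouple. For Legendre polynomials $P_k$ on $[-1, 1]$ (the basis used by Murphy [11]), orthogonality gives $c_k = \frac{2k+1}{2} E[P_k(X)]$, and each $E[P_k(X)]$ is a fixed linear combination of the raw moments. The following minimal Python sketch illustrates this construction; the helper name `legendre_density` and the normalization of the support to $[-1, 1]$ are our assumptions for illustration, not details taken from [11].

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_density(moments, x):
    """Approximate a density supported on [-1, 1] from its raw moments
    mu_0..mu_N via a truncated Legendre expansion:
        f(x) ~ sum_k c_k P_k(x),   c_k = (2k + 1)/2 * E[P_k(X)].
    `moments[i]` must equal E[X**i]; returns the estimate of f at x."""
    N = len(moments) - 1
    coeffs = np.zeros(N + 1)
    for k in range(N + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0
        mono = L.leg2poly(basis)               # monomial coefficients of P_k
        EPk = np.dot(mono, moments[:k + 1])    # E[P_k(X)] from raw moments
        coeffs[k] = (2 * k + 1) / 2.0 * EPk
    return L.legval(x, coeffs)

# Example: raw moments of the uniform density on [-1, 1]; the expansion
# recovers f(x) = 1/2 exactly, since only c_0 is nonzero.
mu = np.array([1.0 if i == 0 else (1.0 / (i + 1) if i % 2 == 0 else 0.0)
               for i in range(6)])
print(legendre_density(mu, np.linspace(-1, 1, 5)))   # ~[0.5 0.5 0.5 0.5 0.5]
```

Truncating after too few terms, or expanding in a basis mismatched to the actual support, produces exactly the oscillation and loss of positivity noted above, since nothing in this linear construction forces the estimate to remain nonnegative.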
Powerful alternatives have been developed [13] over many years. For example, Gauss quadrature rules (GQR) [14] were applied to the evaluation of error probability due to ISI in digital communication by Benedetto et al. [15] (also see [16]). Here, the unknown density is represented by the quadrature rule $\{w_i, x_i\}$, a set of weights and nodes. Using $N$ known moments ($N = 2N_0 + 1$) entails the solution of a set of nonlinear equations through the diagonalization of a tridiagonal Jacobi matrix [14]; the corresponding numerical results are stable. It should also be noted that in certain problems [12], convergence properties similar to or even better than those of the Gauss quadrature rule were observed using the series expansion method. However, as shown in this work and in other applications [7], the maximum entropy method is also stable and may produce results as accurate as GQR using fewer moments. A sketch of the Jacobi-matrix construction is given below.
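The sketch below follows the spirit of the Golub-Welsch procedure behind such rules: the raw moments populate a Hankel matrix whose Cholesky factor yields the three-term recurrence coefficients, and the eigen-decomposition of the resulting tridiagonal Jacobi matrix yields the nodes and weights. The function name `gauss_from_moments` and the Cholesky route are our illustrative choices; [15] may organize the computation differently, and high-order Hankel moment matrices are notoriously ill-conditioned, so production implementations prefer modified-moment algorithms.

```python
import numpy as np

def gauss_from_moments(mu, n):
    """Build an n-point Gauss quadrature rule {w_k, x_k} from the raw
    moments mu[0..2n] of an (unknown) positive density: Cholesky-factor
    the Hankel moment matrix, extract the three-term recurrence
    coefficients, then eigen-decompose the tridiagonal Jacobi matrix."""
    H = np.array([[mu[i + j] for j in range(n + 1)] for i in range(n + 1)])
    R = np.linalg.cholesky(H).T            # H = R^T R, R upper triangular
    # Recurrence coefficients from the rows of R (Golub-Welsch)
    alpha = np.array([R[j, j + 1] / R[j, j]
                      - (R[j - 1, j] / R[j - 1, j - 1] if j > 0 else 0.0)
                      for j in range(n)])
    beta = np.array([R[j, j] / R[j - 1, j - 1] for j in range(1, n)])
    J = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    nodes, V = np.linalg.eigh(J)           # nodes = eigenvalues of J
    weights = mu[0] * V[0, :] ** 2         # weights from eigenvector tops
    return weights, nodes

# Example: moments of the Legendre weight w(x) = 1 on [-1, 1].
mu = [2.0 / (i + 1) if i % 2 == 0 else 0.0 for i in range(5)]
w, x = gauss_from_moments(mu, 2)
print(x)   # ~[-0.5774, 0.5774], the 2-point Gauss-Legendre nodes
print(w)   # ~[1.0, 1.0]
```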
Following these introductory remarks, in Section II we formulate the maximum entropy problem. Applications in digital communication are described in Section III, and numerical results and comparisons to GQR results are reported in the same section. Finally, our conclusions are presented in Section IV.
II. FORMULATION AND NUMERICAL PROCEDURE

We would like to maximize the entropy function with respect to the unknown density $f(x)$. Instead, we can minimize the following with respect to the same:
$$\text{Minimize}\quad H(f)=\int_a^b f(x)\,\ln[f(x)]\,dx \tag{1}$$

$$\text{subject to:}\quad \mu_i=\int_a^b x^i f(x)\,dx \qquad (i=1,2,\cdots,N) \tag{2}$$

with

$$\mu_0=\int_a^b f(x)\,dx=1 \tag{3}$$
and $a > -\infty$, $b < \infty$.
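To indicate where this formulation leads numerically: the standard Lagrangian treatment of (1)-(3) yields a stationary density of exponential form,

$$f(x)=\exp\left(-\lambda_0-\sum_{i=1}^{N}\lambda_i x^i\right),$$

with the multipliers $\lambda_i$ chosen so that constraints (2) and (3) hold. A minimal Python sketch of one way to compute them is given below, using a Newton iteration on the moment residuals; the helper name `maxent_density`, the grid-based integration, and the Newton scheme are our assumptions for illustration, not necessarily the authors' numerical procedure.

```python
import numpy as np

def maxent_density(mu, a, b, iters=50, grid=2000):
    """Estimate f(x) on [a, b] from the moment constraints (2)-(3) by
    maximum entropy.  The stationary density is of exponential form
    f(x) = exp(-lam_0 - sum_i lam_i x^i); lam_1..lam_N are found by
    Newton iteration on the moment residuals, while lam_0 is absorbed
    into the normalization.  mu = [mu_1, ..., mu_N]."""
    N = len(mu)
    x = np.linspace(a, b, grid)
    dx = x[1] - x[0]
    powers = np.vstack([x ** i for i in range(1, 2 * N + 1)])  # x^1..x^2N
    lam = np.zeros(N)
    for _ in range(iters):
        p = np.exp(-lam @ powers[:N])          # unnormalized density
        p /= (p * dx).sum()                    # enforce constraint (3)
        m = (powers * p * dx).sum(axis=1)      # model moments m_1..m_2N
        r = m[:N] - np.asarray(mu)             # residuals vs. targets
        # d(model moments)/d(lam) = -Cov(x^i, x^j), so the Newton step
        # solves C * delta = r with C the covariance matrix.
        C = np.array([[m[i + j + 1] - m[i] * m[j] for j in range(N)]
                      for i in range(N)])
        lam = lam + np.linalg.solve(C, r)
        if np.abs(r).max() < 1e-12:
            break
    return x, p

# Example: with only mu_1 = 0 and mu_2 = 1 on a wide interval, the
# estimate approaches the standard normal density, as expected from
# the maximum entropy property noted in Section I.
x, f = maxent_density([0.0, 1.0], -6.0, 6.0)
print(f[np.argmin(np.abs(x))])   # ~0.3989 = 1/sqrt(2*pi)
```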