IEEE Transactions on Information Theory, vol. 43, no. 3, pp. 1074-1080, May 1997.

Displacement Structure and Maximum Entropy

Tiberiu Constantinescu, Ali H. Sayed, and Thomas Kailath

Abstract

The study of matrices with displacement structure is mainly concerned with a recursion for the so-called generator matrices. The recursion usually involves free parameters, which can be chosen in several ways so as to simplify the resulting algorithm. In this paper we exhibit a choice for the parameters that is motivated by a maximum-entropy formulation. This choice further motivates the introduction of so-called generalized reflection coefficients, which are, in general, different from the better-known Schur coefficients.

Keywords: Maximum entropy, displacement structure, Schur algorithm, reflection coefficient.

I. Introduction

The maximum entropy extension (or loading) problem has attracted considerable attention in the literature. The first solution, by Burg [1], treated Toeplitz matrices and emphasized their parametrization in terms of so-called reflection coefficients, also known as Schur coefficients. In this paper, we exploit the fact that the Toeplitz/Schur ideas can be extended to more general classes of matrices by invoking the concept of displacement structure [2], and show that a very general formulation of the maximum entropy problem is possible. In particular, we provide both global and recursive solutions to the generalized problem.

The connection between maximum entropy extensions and structured matrices will be established in terms of cascade or transmission-line structures, which arise naturally when the Cholesky factorizations of structured matrices are efficiently computed via a generalized Schur algorithm [2]. For a given structured matrix $R$, the algorithm operates recursively on its so-called generator matrix $G$ and provides, at each step, a first-order section (or transfer function/operator). Each section is usually parametrized in terms of two free parameters: a $J$-unitary rotation matrix $\Theta_i$ and a complex scalar $\tau_i$ that is restricted to lie on a circle of given radius. The details of the algorithm in the time-variant scenario are provided in [3, 4].

This work was supported in part by the Army Research Office under contract DAAL03-89-K-0109, and by a grant from the National Science Foundation under award no. MIP-9409319.
T. Constantinescu is with Programs in Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75083. Phone (214) 883-2104, Fax (214) 699-6622, email: [email protected]
A. H. Sayed is with the Department of Electrical Engineering, University of California, Los Angeles, CA 90095. Phone (310) 267-2142, Fax (310) 206-8495, email: [email protected]
T. Kailath is with the Department of Electrical Engineering, Stanford University, Palo Alto, CA 94305. Phone (415) 723-3688, Fax (415) 723-8473, [email protected].

[Figure 1: A transmission line mapping $K$ into $S$. The sections of the cascade $\mathbf{T}$ are labeled $(\Theta_0, \tau_0), (\Theta_1, \tau_1), \ldots, (\Theta_n, \tau_n)$.]

A sequence of $(n+1)$ steps of the generalized Schur algorithm would lead to a cascade of $(n+1)$ such sections, known as a transmission line, which we will denote by $\mathbf{T}$ (see Fig. 1). Under certain positivity and finite-dimensionality conditions [3], the cascade $\mathbf{T}$ is known to map, in a certain way, contractive operators $K$ to contractive operators $S$, written simply as $S = \mathbf{T}[K]$.

Now different choices for $\{\Theta_i, \tau_i\}$ lead to different expressions for the first-order sections and to different forms for the generator recursion itself. For example, one particular choice for $\{\Theta_i, \tau_i\}$, which will be discussed in Section III-A, allows the generator recursion to be written in a simplified, so-called proper form, which is often desirable from a computational point of view, e.g., in interpolation problems [3, 4, 5]. Other choices for $\{\Theta_i, \tau_i\}$, while leading to different forms for the first-order sections and for the generator recursion, further allow us to impose other desirable properties on the cascade $\mathbf{T}$.

The present paper addresses one such issue. More specifically, it shows how to construct a cascade $\mathbf{T}$, and in particular how to choose the above-mentioned free parameters $\{\Theta_i, \tau_i\}$, such that the resulting cascade $\mathbf{T}$ will map the zero load ($K = 0$) to the maximum-entropy solution, as in the classical result [1]. We shall see that, in general, the cascade that corresponds to the proper choice for $\{\Theta_i, \tau_i\}$ does not map the zero load to the maximum-entropy solution. Moreover, we shall be motivated to introduce a new set of contractive coefficients, one for each section of the cascade, which will be shown, in general, to be distinct from the Schur parameters encountered in the proper case (see, e.g., [4, Sec. 5] and [3]).

A. Related works in the literature

Similar issues of relating the maximum entropy solution to the central solution (corresponding to the zero load) have been addressed in the literature [6, 7, 8]. The work [6] deals with time-dependent entropy problems and also considers contractive extension problems. The reference [7] employs the framework of the lifting of commutants, while the reference [8] employs tools of the W-transform technique studied in [9]. In particular, the work [8] poses a maximum entropy problem in the context of linear fractional transformations that arise in time-variant discrete-time $H_\infty$ control. The work shows how to choose a particular contractive load that maximizes a time-variant entropy measure, and provides state-space formulae and global expressions for the entropy operator.

The current work departs from these earlier works in the sense that it focuses on a recursive (rather than a global) construction of the maximum entropy solution. This is useful in situations where the available data is updated and it is desired to re-evaluate the corresponding maximum entropy solution by exploiting the available cascade from the earlier calculations. A recursive procedure allows us to evaluate this new solution by simply appending a new section to the earlier cascade. Global expressions, on the other hand, need to be evaluated afresh whenever the data is modified, which is not convenient in recursive scenarios that arise, for example, in adaptive schemes.
We have chosen to present the results of this paper in an operator setting for generality of exposition. The results, however, can be easily specialized to particular situations.

B. Notation

The symbol $\mathbb{Z}$ denotes the set of integers, and for two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$ we write $\mathcal{L}(\mathcal{H}, \mathcal{H}')$ to denote the set of bounded linear operators acting from $\mathcal{H}$ into $\mathcal{H}'$. We further consider three families $\{\mathcal{U}(t), \mathcal{V}(t), \mathcal{R}(t)\}_{t \in \mathbb{Z}}$ of Hilbert spaces depending on the parameter $t \in \mathbb{Z}$, two families of bounded linear operators $G(t) \in \mathcal{L}(\mathcal{U}(t) \oplus \mathcal{V}(t), \mathcal{R}(t))$ and $F(t) \in \mathcal{L}(\mathcal{R}(t-1), \mathcal{R}(t))$, and we define the symmetry $J(t) = (I_{\mathcal{U}(t)} \oplus -I_{\mathcal{V}(t)})$ acting on $\mathcal{U}(t) \oplus \mathcal{V}(t)$, where $I_{\mathcal{U}(t)}$ denotes the identity operator on the space $\mathcal{U}(t)$. We partition $G(t) = [\,U(t) \;\; V(t)\,]$, where $U(t) \in \mathcal{L}(\mathcal{U}(t), \mathcal{R}(t))$ and $V(t) \in \mathcal{L}(\mathcal{V}(t), \mathcal{R}(t))$. We also use the symbol $*$ to denote the adjoint operator and we write $F^*(t) = (F(t))^*$.

Definition 1: A family of operators $\{R(t) \in \mathcal{L}(\mathcal{R}(t))\}_{t \in \mathbb{Z}}$ is said to have a time-variant displacement structure with respect to $\{F(t), G(t)\}_{t \in \mathbb{Z}}$ if $\{R(t)\}_{t \in \mathbb{Z}}$ is uniformly bounded, viz., there exists $r > 0$ such that $\|R(t)\| \le r$ for all $t \in \mathbb{Z}$, and $R(t)$ satisfies the time-variant Lyapunov (or displacement) equation

$$R(t) - F(t)R(t-1)F^*(t) = G(t)J(t)G^*(t). \tag{1}$$

The cardinal number $r(t) = \dim \mathcal{U}(t) + \dim \mathcal{V}(t)$ is called the displacement rank of $R(t)$ in (1). We say that (1) has a Pick solution if $R(t)$ is positive-semidefinite for every $t \in \mathbb{Z}$. Throughout the paper we assume that the following conditions hold (viz., conditions (8a)-(8e) in [4]): (a) there exists a positive integer $n$ such that $\mathcal{R}(t) = \oplus_{i=0}^{n-1} \mathcal{R}_i(t)$ for all $t$; (b) $\dim \mathcal{R}_i(t)$ are all equal and finite; (c) $\dim \mathcal{U}(t)$ and $\dim \mathcal{V}(t)$ are finite; (d) $\{F(t)\}$ is a uniformly bounded family of lower triangular operators with stable families of diagonal entries $\{f_i(t)\}_{i=0}^{n-1}$ (i.e., there exist $c_i > 0$ such that $\|f_i(t)\| \le c_i < 1$ for all $t$); (e) $\{G(t)\}$ is a uniformly bounded family. Under these assumptions, the infinite block matrices

$$\mathbf{U}(t) = \begin{bmatrix} \cdots & F(t)F(t-1)U(t-2) & F(t)U(t-1) & U(t) \end{bmatrix},$$

$$\mathbf{V}(t) = \begin{bmatrix} \cdots & F(t)F(t-1)V(t-2) & F(t)V(t-1) & V(t) \end{bmatrix},$$

are well-defined bounded linear operators, and the displacement equation (1) is guaranteed to have a unique uniformly bounded solution that is given by $R(t) = \mathbf{U}(t)\mathbf{U}^*(t) - \mathbf{V}(t)\mathbf{V}^*(t)$. We further assume the following so-called nondegeneracy condition: (f) the operator $\mathbf{U}(t)\mathbf{U}^*(t)$ is uniformly bounded from below, viz., there exists $\epsilon > 0$ such that $\mathbf{U}(t)\mathbf{U}^*(t) \ge \epsilon I > 0$ for all $t \in \mathbb{Z}$.

Assumptions (a)-(f) allow us to state (see Thm 4.7 in [4]) that the time-variant displacement equation (1) has a Pick solution $R(t)$ such that $R(t) > \epsilon I > 0$ for a constant $\epsilon$ and for all $t \in \mathbb{Z}$ if, and only if, there exists an upper-triangular strict contraction $S$ ($\|S\| < 1$), $S \in \mathcal{L}(\oplus_{t \in \mathbb{Z}} \mathcal{V}(t), \oplus_{t \in \mathbb{Z}} \mathcal{U}(t))$, such that

$$\mathbf{V}(t) = \mathbf{U}(t) P_{\mathcal{U}}(t)\, S \big|_{\oplus_{j \le t} \mathcal{V}(j)} \quad \text{for every } t \in \mathbb{Z}, \tag{2}$$

where $P_{\mathcal{U}}(t)$ denotes the orthogonal projection of $\oplus_{t \in \mathbb{Z}} \mathcal{U}(t)$ onto $\oplus_{j \le t} \mathcal{U}(j)$.

Let $\mathcal{S}$ denote the set of all upper-triangular strictly contractive operators $S$ that satisfy (2). For every such $S \in \mathcal{S}$ it follows that $I - S^*S$ is a positive operator. Let $\Psi_S$ denote its spectral factor (as defined in [10, 11, 12]). In the following, we write $D(A)$ to denote the diagonal of an upper-triangular operator $A$.

Problem 1: Let $\Psi_S$ denote the spectral factor associated with an upper-triangular strictly contractive operator $S \in \mathcal{S}$. The maximum entropy problem is to solve the following optimization criterion:

$$\max_{S \in \mathcal{S}} \{ D(\Psi_S)^* D(\Psi_S) \}. \tag{3}$$

Interpretations, motivations, and applications of problems of this kind abound in the literature. For formulations close to the above one we refer to [6, 7, 8, 13].

II. Solution of the Optimization Problem

Define the direct sum $\mathbf{J} = \oplus_{t \in \mathbb{Z}} J(t)$ and consider a bounded upper-triangular operator

$$\mathbf{T} \in \mathcal{L}\Big( \big(\oplus_{t \in \mathbb{Z}} \mathcal{U}(t)\big) \oplus \big(\oplus_{t \in \mathbb{Z}} \mathcal{V}(t)\big) \Big),$$

whose matrix entries $\{T_{lj}\}$ are partitioned accordingly with $J(l)$ and $J(j)$, say

$$T_{lj} = \begin{bmatrix} T_{11}^{lj} & T_{12}^{lj} \\ T_{21}^{lj} & T_{22}^{lj} \end{bmatrix}.$$

We further construct the upper-triangular operators $T_{11} = [T_{11}^{lj}]$, $T_{21} = [T_{21}^{lj}]$, $T_{12} = [T_{12}^{lj}]$, $T_{22} = [T_{22}^{lj}]$, for $-\infty < l, j < \infty$. The operator $\mathbf{T}$ will be said to be $\mathbf{J}$-inner (see, e.g., [15, Thm 2.3]) if i) it is $\mathbf{J}$-unitary, i.e., $\mathbf{T}^*\mathbf{J}\mathbf{T} = \mathbf{T}\mathbf{J}\mathbf{T}^* = \mathbf{J}$, and ii) $T_{22}^{-1}$ is a bounded upper-triangular operator. In this case, it follows that $T_{22}^{-1}T_{21}$ is an upper-triangular strictly contractive operator, $\|T_{22}^{-1}T_{21}\| < 1$. It was shown in [4, Thm. 4.8] that, starting with $\{F(t), G(t), J(t)\}$ of (1), there exists a bounded upper-triangular $\mathbf{J}$-inner operator $\mathbf{T}$ that can be determined as a function of the given $\{F(t), G(t), J(t)\}$, and such that $S \in \mathcal{S}$ if and only if there exists $K$ such that

$$S = \mathbf{T}[K] = -(T_{11}K + T_{12})(T_{21}K + T_{22})^{-1}, \tag{4}$$

where $K$ is an upper-triangular strictly contractive operator, $\|K\| < 1$. We now have the following.

Lemma 1: Consider an $S \in \mathcal{S}$ and let $K$ be the associated operator, $S = \mathbf{T}[K]$. Then its spectral factor $\Psi_S$ can be chosen according to the formula $\Psi_S = \Psi_K (T_{21}K + T_{22})^{-1}$.

Proof: It follows from the $\mathbf{J}$-innerness of $\mathbf{T}$ that

$$I - S^*S = (K^*T_{21}^* + T_{22}^*)^{-1}(I - K^*K)(T_{21}K + T_{22})^{-1}. \tag{5}$$

Let $\Psi_K \in \mathcal{L}(\oplus_{t \in \mathbb{Z}} \mathcal{V}(t), \oplus_{t \in \mathbb{Z}} \mathcal{V}'(t))$ be the spectral factor of $K$, and define $\Psi = \Psi_K (T_{21}K + T_{22})^{-1}$. We thus have that $\Psi$ is an upper-triangular operator that obeys the condition that the space $\Psi(\oplus_{j \le t} \mathcal{V}(j))$ is dense in $\oplus_{j \le t} \mathcal{V}'(j)$ for all $t \in \mathbb{Z}$. Moreover, the inequality $\Psi_K^*\Psi_K \le I - K^*K$ allows us to conclude, in conjunction with (5), that $\Psi^*\Psi \le I - S^*S$. Now consider any other upper-triangular contraction $Z \in \mathcal{L}(\oplus_{t \in \mathbb{Z}} \mathcal{V}(t), \oplus_{t \in \mathbb{Z}} \mathcal{V}''(t))$ such that $Z^*Z \le I - S^*S$. It follows from (5) that

$$(K^*T_{21}^* + T_{22}^*)\, Z^*Z\, (T_{21}K + T_{22}) \le I - K^*K,$$

and using the properties of the spectral factors, we must certainly have $(K^*T_{21}^* + T_{22}^*)\, Z^*Z\, (T_{21}K + T_{22}) \le \Psi_K^*\Psi_K$. This implies that $Z^*Z \le \Psi^*\Psi$ and, consequently, $\Psi = \Psi_K(T_{21}K + T_{22})^{-1}$ can be chosen as the spectral factor of $S = \mathbf{T}[K]$.

We are now in a position to state the solution of problem (3) [see also [7, 8] for alternative arguments]. For this purpose, and for notational convenience, we denote the upper-triangular operators $T_{22}^{-1}T_{21}$ and $T_{22}^{-1}$ by $\Lambda$ and $\varphi$, respectively.

Lemma 2: Assume conditions (a)-(f) hold and let $\mathbf{T}$ be the $\mathbf{J}$-inner upper-triangular operator described above. Then

$$\max_{S \in \mathcal{S}} \{ D(\Psi_S)^* D(\Psi_S) \} = \left[ D(T_{22})D(T_{22})^* - D(T_{21})D(T_{21})^* \right]^{-1} = [D(T_{22})^*]^{-1}\, [I - D(\Lambda)D(\Lambda)^*]^{-1}\, [D(T_{22})]^{-1}.$$

Moreover, the maximum is attained if, and only if, $S = S_0 = \mathbf{T}[D(\Lambda)^*]$. In particular, if $D(\Lambda) = 0$ or, equivalently, $D(T_{21}) = 0$, then $\max_{S \in \mathcal{S}} \{ D(\Psi_S)^* D(\Psi_S) \} = [D(T_{22})^*]^{-1}[D(T_{22})]^{-1}$, and the maximum is attained for $S = S_0 = \mathbf{T}[0] = -T_{12}T_{22}^{-1}$.

Proof: The argument uses Lemma 1 and follows closely the proof of the main result in [13] (which is in Russian). [The monograph [14], esp. Ch. 11, contains a number of examples of maximum entropy problems that might be more accessible to an English reader].

The unique $S_0 = \mathbf{T}[D(T_{22}^{-1}T_{21})^*]$ is called the maximum-entropy solution of (3). The unique $\mathbf{T}[0] = -T_{12}T_{22}^{-1}$ is called the central solution since it corresponds to choosing $K = 0$.

As mentioned earlier in the introduction, the above statement provides a global characterization of the maximum entropy solution (and has also been studied in [6, 7, 8]). In particular, note that the expression for the required load is given in terms of the (block) entries of the entire cascade $\mathbf{T}$. The contribution of this paper is to exhibit a recursive construction of the maximum-entropy solution $S_0$ that does not require prior knowledge of the global expression for $\mathbf{T}$. The details are presented in the remaining sections.



III. A Recursive Solution

The recursive procedure will follow from an algorithm derived in [3, 4] for the triangular factorization of time-variant matrices with displacement structure. To clarify this, consider block matrices $R(t) = [r_{lj}(t)]_{l,j=0}^{n-1}$ and let $R_i(t)$ denote the Schur complement of the leading $i \times i$ block submatrix of $R(t)$. If $l_i(t)$ and $d_i(t)$ stand for the first block column and the $(0,0)$ block entry of $R_i(t)$, respectively, then the successive Schur complements of $R(t)$ are recursively related as follows:

$$R_i(t) - l_i(t)d_i^{-1}(t)l_i^*(t) = \begin{bmatrix} 0 & 0 \\ 0 & R_{i+1}(t) \end{bmatrix}, \qquad R_0(t) = R(t).$$

We further note that the positive-definiteness of $R(t)$ guarantees $d_i(t) > 0$ for all $i$. Also, the notation $d_i^{-1}(t)$ stands for $d_i(t)^{-1}$. After $n$ consecutive Schur complement steps we obtain the block triangular factorization of $R(t)$, viz.,

$$R(t) = l_0(t)d_0^{-1}(t)l_0^*(t) + \begin{bmatrix} 0 \\ l_1(t) \end{bmatrix} d_1^{-1}(t) \begin{bmatrix} 0 \\ l_1(t) \end{bmatrix}^* + \cdots = L(t)D^{-1}(t)L^*(t),$$

where $D(t) = \mathrm{diag}\{d_0(t), \ldots, d_{n-1}(t)\}$ is a block diagonal matrix, and the (nonzero parts of the) columns of the block lower triangular matrix $L(t)$ are $\{l_0(t), \ldots, l_{n-1}(t)\}$. It was shown in [4, 3] that for structured matrices $R(t)$ as in (1), the triangular factor at time $t-1$, viz., $L(t-1)$, can be time-updated to the triangular factor at time $t$, $L(t)$, via a recursive procedure on the generator matrix $G(t)$ as described below. Start with $F_0(t) = F(t)$, $G_0(t) = G(t)$, and repeat for $i \ge 0$:

- Choose uniformly bounded sequences $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$ that satisfy the following time-variant embedding relation:

$$\begin{bmatrix} f_i(t) & g_i(t) \\ h_i(t) & k_i(t) \end{bmatrix} \begin{bmatrix} d_i(t-1) & 0 \\ 0 & J(t) \end{bmatrix} \begin{bmatrix} f_i(t) & g_i(t) \\ h_i(t) & k_i(t) \end{bmatrix}^* = \begin{bmatrix} d_i(t) & 0 \\ 0 & J(t) \end{bmatrix}, \tag{6}$$

where $g_i(t)$ denotes the top block row of $G_i(t)$.

- Apply the recursion

$$\begin{bmatrix} l_i(t) & \begin{matrix} 0 \\ G_{i+1}(t) \end{matrix} \end{bmatrix} = \begin{bmatrix} F_i(t)l_i(t-1) & G_i(t) \end{bmatrix} \begin{bmatrix} f_i^*(t) & h_i^*(t)J(t) \\ J(t)g_i^*(t) & J(t)k_i^*(t)J(t) \end{bmatrix}. \tag{7}$$

Moreover, $d_i(t) = f_i(t)d_i(t-1)f_i^*(t) + g_i(t)J(t)g_i^*(t)$, and $R_{i+1}(t)$ satisfies the time-variant displacement equation

$$R_{i+1}(t) - F_{i+1}(t)R_{i+1}(t-1)F_{i+1}^*(t) = G_{i+1}(t)J(t)G_{i+1}^*(t),$$

where $F_{i+1}(t)$ is the submatrix obtained after deleting the first row and column of $F_i(t)$. Let $\mathbf{T}_i = [T_{lj}^{(i)}]_{l,j}$ denote the upper-triangular transfer operator with time-variant Markov parameters:

$$\begin{aligned} T_{ll}^{(i)} &= J(l)k_i^*(l)J(l), \\ T_{l,l+1}^{(i)} &= J(l)g_i^*(l)h_i^*(l+1)J(l+1), \\ T_{lj}^{(i)} &= J(l)g_i^*(l)f_i^*(l+1)f_i^*(l+2)\cdots f_i^*(j-1)h_i^*(j)J(j), \quad \text{for } j > l+1. \end{aligned} \tag{8}$$
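The successive Schur complement steps that underlie the factorization $R = LD^{-1}L^*$ can be sketched numerically. The following is only an illustration under simplifying assumptions that are not part of the paper: a single time instant, real scalar blocks, and a dense update of each Schur complement. It does not implement the fast generator recursion (7), and the function names `schur_ldl` and `reconstruct` are hypothetical.

```python
from fractions import Fraction


def schur_ldl(R):
    """Factor a symmetric positive-definite matrix as R = L D^{-1} L^T.

    Assumption: real scalar entries, given as a list of rows.
    Returns (L, d), where column i of L holds l_i and d[i] is the pivot d_i.
    """
    n = len(R)
    Ri = [[Fraction(x) for x in row] for row in R]  # R_0 = R
    L = [[Fraction(0)] * n for _ in range(n)]
    d = []
    for i in range(n):
        l_i = [row[0] for row in Ri]  # first column of R_i
        d_i = Ri[0][0]                # (0, 0) entry of R_i
        if d_i <= 0:
            raise ValueError("matrix is not positive definite")
        d.append(d_i)
        for k, val in enumerate(l_i):
            L[i + k][i] = val
        # R_i - l_i d_i^{-1} l_i^T = [[0, 0], [0, R_{i+1}]]
        m = len(Ri)
        Ri = [[Ri[r][c] - l_i[r] * l_i[c] / d_i for c in range(1, m)]
              for r in range(1, m)]
    return L, d


def reconstruct(L, d):
    """Rebuild L D^{-1} L^T so the factorization can be checked."""
    n = len(L)
    return [[sum(L[r][k] * L[c][k] / d[k] for k in range(n))
             for c in range(n)] for r in range(n)]


R = [[4, 2, 1], [2, 3, 1], [1, 1, 2]]
L, d = schur_ldl(R)
assert reconstruct(L, d) == R
```

Exact rational arithmetic is used here so that the reconstruction can be compared with `==`; with floating-point entries a tolerance would be needed instead.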

After $n$ recursive steps we obtain a cascade of sections $\mathbf{T} = \mathbf{T}_0\mathbf{T}_1 \cdots \mathbf{T}_{n-1}$, which may be regarded as a generalized transmission line. This is the $\mathbf{J}$-inner operator that parametrizes all $S \in \mathcal{S}$ in (4).

The choice of $\{h_i(t), k_i(t)\}$ in (6) is nonunique and, therefore, the generator matrix $G_{i+1}(t)$ in (7) is also nonunique. Each choice for $\{h_i(t), k_i(t)\}$ would lead to a valid $G_{i+1}(t)$. There are, for instance, special choices for $\{h_i(t), k_i(t)\}$ that would lead to considerable simplifications in the computational requirements, since they lead to what are known as proper generators, as developed in [18] for the time-invariant case and in [3] for the time-variant case. But these choices do not generally lead to a maximum-entropy solution.

We shall show, however, that it is always possible to find $\{h_i(t), k_i(t)\}$, usually distinct from the choice in the proper case, so as to result in a cascade $\mathbf{T}$ whose central value, viz., $\mathbf{T}[0] = -T_{12}T_{22}^{-1}$, will correspond to the maximum-entropy solution. To achieve this, all we need to do is to exhibit uniformly bounded choices for $\{h_i(t), k_i(t)\}$ that would result in a cascade $\mathbf{T}$ for which $D(T_{22}^{-1}T_{21}) = 0$. One way to guarantee this is to require that for each individual operator $\mathbf{T}_i$ we have $D(T_{22,i}^{-1}T_{21,i}) = 0$, where the index $i$ in $T_{22,i}$ and $T_{21,i}$ refers to the $i$th section.

But first let us elaborate on the nonunique choice of $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$ so as to satisfy the embedding relation (6). For this purpose, we recall a result in [4, Thm 4.1] where it was shown that the following choices for $h_i(t)$ and $k_i(t)$ satisfy (6):

$$\begin{aligned} h_i(t) &= \Theta_i^{-1}(t)J(t)g_i^*(t) \left[ d_i^{*/2}(t) - \tau_i(t)d_i^{*/2}(t-1)f_i^*(t) \right]^{-1} \left[ \tau_i(t)d_i^{-1/2}(t-1) - d_i^{-1/2}(t)f_i(t) \right], \\ k_i(t) &= \Theta_i^{-1}(t) \left\{ I - J(t)g_i^*(t) \left[ d_i^{*/2}(t) - \tau_i(t)d_i^{*/2}(t-1)f_i^*(t) \right]^{-1} d_i^{-1/2}(t)g_i(t) \right\}, \end{aligned} \tag{9}$$

for an arbitrary $J(t)$-unitary operator $\Theta_i(t)$ and an arbitrary unitary operator $\tau_i(t)$, whenever the inverse of $\left( d_i^{*/2}(t) - \tau_i(t)d_i^{*/2}(t-1)f_i^*(t) \right)$ exists. Here, $d_i^{1/2}(t)$ denotes the operator defined by $d_i(t) = d_i^{1/2}(t)d_i^{*/2}(t)$. [The finite-dimensionality conditions guarantee that it is always possible to choose a unitary matrix $\tau_i(t)$ so as to assure the invertibility of $\left( d_i^{*/2}(t) - \tau_i(t)d_i^{*/2}(t-1)f_i^*(t) \right)$].

A specific choice for $\tau_i(t)$, along with the choice $\Theta_i(t) = I$, was shown in [4] to guarantee the corresponding $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$, which we shall denote by $\{\bar{h}_i(t), \bar{k}_i(t)\}_{t \in \mathbb{Z}}$, to be uniformly bounded. But other choices for $\{\Theta_i(t), \tau_i(t)\}$ that would guarantee the uniform boundedness of the corresponding $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$ are also possible. Examples to this effect, with specific values for $\{\Theta_i(t), \tau_i(t)\}$, are given later (see, e.g., (13)). With each uniformly bounded choice $\bar{k}_i(t)$, we associate a strict contraction $\rho_i(t)$ that is defined below, and which will be referred to as a generalized reflection coefficient.

Definition 2: Let $\{\bar{k}_i(t)\}_{t \in \mathbb{Z}}$ be any uniformly bounded sequence that satisfies the embedding relation (6), and partition it accordingly with $J(t)$,

$$\bar{k}_i(t) = \begin{bmatrix} \bar{k}_i^{(11)}(t) & \bar{k}_i^{(12)}(t) \\ \bar{k}_i^{(21)}(t) & \bar{k}_i^{(22)}(t) \end{bmatrix}.$$

The corresponding generalized reflection coefficient $\rho_i(t)$ is defined by

$$\rho_i(t) = -\bar{k}_i^{(12)}(t)\,(\bar{k}_i^{(22)}(t))^{-1} \in \mathcal{L}(\mathcal{V}(t), \mathcal{U}(t)).$$
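To make the embedding relation and the definition of the generalized reflection coefficient concrete, the sketch below checks them numerically in the simplest setting. It rests on assumptions that go beyond the text: the time-invariant case with scalar $f$, a $1 \times 2$ row $g = [u \;\; v]$ with $|u| > |v|$, $J = \mathrm{diag}(1, -1)$, and $\Theta = I$, for which the formulas for $h$ and $k$ reduce to $h = J g^* (\tau - f) / ((1 - \tau f^*) d)$ and $k = I - J g^* g / ((1 - \tau f^*) d)$. The helper names are hypothetical.

```python
import cmath


def matmul(A, B):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def adjoint(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def embedding_check(f, u, v, tau):
    """Return (residual of the embedding relation, reflection coefficient).

    Assumptions: scalar time-invariant case, Theta = I, |f| < 1,
    |u| > |v|, |tau| = 1.
    """
    f, u, v, tau = complex(f), complex(u), complex(v), complex(tau)
    # d solves d = f d f^* + g J g^*
    d = (abs(u) ** 2 - abs(v) ** 2) / (1 - abs(f) ** 2)
    c = 1 / (1 - tau * f.conjugate())
    Jg = [u.conjugate(), -v.conjugate()]  # the column J g^*
    g = [u, v]
    h = [x * (tau - f) * c / d for x in Jg]
    k = [[(1 if r == s else 0) - c * Jg[r] * g[s] / d for s in range(2)]
         for r in range(2)]
    M = [[f, u, v],
         [h[0], k[0][0], k[0][1]],
         [h[1], k[1][0], k[1][1]]]
    W_in = [[d, 0, 0], [0, 1, 0], [0, 0, -1]]  # diag(d, J)
    lhs = matmul(matmul(M, W_in), adjoint(M))
    residual = max(abs(lhs[i][j] - W_in[i][j])
                   for i in range(3) for j in range(3))
    rho = -k[0][1] / k[1][1]  # -k^(12) (k^(22))^{-1}
    return residual, rho


residual, rho = embedding_check(0.3 + 0.2j, 1.0 + 0.5j, 0.2 - 0.4j,
                                cmath.exp(0.7j))
assert residual < 1e-12 and abs(rho) < 1
```

In the time-invariant case the weights on both sides of the embedding relation coincide, which is why a single matrix `W_in` serves as both $\mathrm{diag}(d(t-1), J)$ and $\mathrm{diag}(d(t), J)$.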

We can now state the main result of this paper.

Theorem 1: Assume conditions (a)-(f) hold and let $R(t)$ be the unique Pick solution of (1), viz., $R(t) > \epsilon I > 0$ for a constant $\epsilon$ and for all $t \in \mathbb{Z}$. Then we can always choose uniformly bounded families $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$ such that the associated $\mathbf{J}$-inner operator $\mathbf{T}$ has the property that $S_0 = \mathbf{T}[0] = -T_{12}T_{22}^{-1}$. That is, the central solution of the cascade coincides with the maximum-entropy solution of Problem (3).

Proof: The proof is constructive. It follows from Lemma 2 that the central solution $\mathbf{T}[0] = -T_{12}T_{22}^{-1}$ coincides with the maximum-entropy solution $S_0$ if, and only if, $D(T_{21}) = 0$. We now show how to choose uniformly bounded families $\{h_i(t), k_i(t)\}_{t \in \mathbb{Z}}$ so as to guarantee $D(T_{21,i}) = 0$ for each $i = 0, 1, \ldots, n-1$.

We have indicated above that it is always possible to find uniformly bounded families $\{\bar{h}_i(t)\}_{t \in \mathbb{Z}}$, $\{\bar{k}_i(t)\}_{t \in \mathbb{Z}}$ such that the embedding relation (6) holds. Let $\bar{\mathbf{T}}_i = [\bar{T}_{lj}^{(i)}]_{l,j}$ denote the transfer operator associated with $\{f_i(t), g_i(t), \bar{h}_i(t), \bar{k}_i(t)\}$, as in (8). We conclude from the embedding relation (6) that $\bar{h}_i(t)d_i(t-1)\bar{h}_i^*(t) + \bar{k}_i(t)J(t)\bar{k}_i^*(t) = J(t)$ and, consequently, $J(t) - \bar{k}_i(t)J(t)\bar{k}_i^*(t) = \bar{h}_i(t)d_i(t-1)\bar{h}_i^*(t) \ge 0$. Since $\dim \mathcal{U}(t) < \infty$ and $\dim \mathcal{V}(t) < \infty$, we also conclude that $J(t) - \bar{k}_i^*(t)J(t)\bar{k}_i(t) \ge 0$ for all $t \in \mathbb{Z}$. If we partition $\bar{k}_i(t)$ accordingly with $J(t)$,

$$\bar{k}_i(t) = \begin{bmatrix} \bar{k}_i^{(11)}(t) & \bar{k}_i^{(12)}(t) \\ \bar{k}_i^{(21)}(t) & \bar{k}_i^{(22)}(t) \end{bmatrix},$$

we then obtain $\bar{k}_i^{(22)*}(t)\bar{k}_i^{(22)}(t) \ge I + \bar{k}_i^{(12)*}(t)\bar{k}_i^{(12)}(t)$. Therefore, $\bar{k}_i^{(22)}(t)$ is invertible and $\|(\bar{k}_i^{(22)}(t))^{-1}\| \le 1$. We also know that $\|\bar{k}_i^{(22)}(t)\| \le M$ for a certain $M > 0$. We now define the corresponding generalized reflection coefficient

$$\rho_i(t) = -\bar{k}_i^{(12)}(t)\,(\bar{k}_i^{(22)}(t))^{-1} \in \mathcal{L}(\mathcal{V}(t), \mathcal{U}(t)),$$

which satisfies

$$I - \rho_i^*(t)\rho_i(t) \ge (\bar{k}_i^{(22)}(t))^{-*}(\bar{k}_i^{(22)}(t))^{-1} \ge \frac{1}{M^2} I. \tag{10}$$

Hence, $(I - \rho_i^*(t)\rho_i(t))^{-1} \le M^2 I$. Moreover, from the identity $(I - \rho_i(t)\rho_i^*(t))^{-1} = I + \rho_i(t)(I - \rho_i^*(t)\rho_i(t))^{-1}\rho_i^*(t)$ we obtain that $(I - \rho_i(t)\rho_i^*(t))^{-1} \le (1 + M^2) I$. We further define the family of $J(t)$-unitary matrices $\Theta_i(t) = H(\rho_i(t))$, and remark that it is uniformly bounded. Using this choice for $\Theta_i(t)$ in (9) we conclude that the choices

$$h_i(t) = \Theta_i^{-1}(t)\bar{h}_i(t), \qquad k_i(t) = \Theta_i^{-1}(t)\bar{k}_i(t),$$

satisfy the embedding relation (6), are uniformly bounded over $t$, and result in $D(T_{21,i}) = 0$ since the choice for $\Theta_i(t)$ forces $k_i(t)$ to be block-lower triangular or, equivalently, $J(t)k_i^*(t)J(t)$ to be block-upper triangular.

We should note, however, that the construction used in the previous proof is only one, among several possibilities, that would guarantee the condition $D(T_{21}) = 0$. This is because the above construction achieves $D(T_{21}) = 0$ by assuring that each individual section, or operator, satisfies a similar condition, $D(T_{21,i}) = 0$, thus resulting in an overall cascade that satisfies $D(T_{21}) = 0$. But, as we shall show in an example in the next section, it is possible to have $D(T_{21}) = 0$ without requiring all the individual sections to satisfy $D(T_{21,i}) = 0$.

A. Strictly Lower-Triangular F(t)

Let us first concentrate on the case of strictly lower-triangular matrices $F(t)$, viz., $f_i(t) = 0$ for all $t \in \mathbb{Z}$ and $i = 0, \ldots, n-1$. We begin with the additional assumption

$$\dim \mathcal{R}_i(t) = \dim \mathcal{U}(t) \quad \text{for all } t \in \mathbb{Z},\; i = 0, 1, \ldots, n-1. \tag{11}$$

The more general case can be similarly treated and we omit the details. Let $g_i(t) = [\,u_i(t) \;\; v_i(t)\,]$ denote the top block row of $G_i(t)$, and note that it follows from the displacement equation for $R_i(t)$ that $g_i(t)J(t)g_i^*(t) = d_i(t) > 0$. This implies that there exists a uniquely determined matrix $\gamma_i(t)$, $\|\gamma_i(t)\| < 1$, such that

$$v_i(t) = u_i(t)\gamma_i(t), \tag{12}$$

and we can define the $J(t)$-unitary rotation $H(\gamma_i(t))$. It reduces the top row of $G_i(t)$ to the form $g_i(t)H(\gamma_i(t)) = [\,\delta_i(t) \;\; 0_{\mathcal{V}(t)}\,]$, and we say that $G_i(t)$ is reduced to proper form. This will allow us to further simplify the generator recursion (7) as detailed ahead. We shall refer to the $\gamma_i(t)$ as the Schur parameters associated with the displacement equation (1), when $F(t)$ is strictly lower triangular. Consider further the following uniformly bounded choices (recall (9)),

$$\bar{k}_i(t) = I - J(t)g_i^*(t)d_i^{-1}(t)g_i(t), \qquad \bar{h}_i(t) = J(t)g_i^*(t)d_i^{-*/2}(t)\tau_i(t)d_i^{-1/2}(t-1), \tag{13}$$

where $\tau_i(t)$ is unitary and $\Theta_i(t) = I$. We further partition $\bar{k}_i(t)$ accordingly with $J(t)$, and introduce the generalized reflection coefficients

$$\rho_i(t) = -\bar{k}_i^{(12)}(t)\,(\bar{k}_i^{(22)}(t))^{-1}. \tag{14}$$

Despite the simple proof, the following result is quite unexpected.

Theorem 2: Consider the setting of Theorem 1 and let $R(t)$ be the unique Pick solution of (1), viz., $R(t) > \epsilon I > 0$ for a constant $\epsilon$ and for all $t \in \mathbb{Z}$. Assume further that $F(t)$ is strictly lower-triangular and $\dim \mathcal{R}_i(t) = \dim \mathcal{U}(t)$ for all $t \in \mathbb{Z}$ and $i = 0, 1, \ldots, n-1$. Then the Schur parameters $\{\gamma_i(t)\}$, defined via (12), and the generalized reflection coefficients $\{\rho_i(t)\}$, defined via (14), coincide,

$$\rho_i(t) = \gamma_i(t) \quad \text{for } t \in \mathbb{Z},\; i = 0, 1, \ldots, n-1.$$

Proof: Since $\dim \mathcal{R}_i(t) = \dim \mathcal{U}(t)$ for all $t \in \mathbb{Z}$, $i = 0, 1, \ldots, n-1$, and $u_i(t)u_i^*(t) \ge \epsilon I + v_i(t)v_i^*(t)$ for a certain $\epsilon > 0$, we get that the $u_i(t)$ are invertible matrices. Consequently (dropping the argument $t$ on the right-hand side),

$$\begin{aligned} \rho_i(t) &= u_i^* d_i^{-1} v_i \left( I + v_i^* d_i^{-1} v_i \right)^{-1} \\ &= u_i^* \left( u_i (I - \gamma_i\gamma_i^*) u_i^* \right)^{-1} u_i\gamma_i \left( I + \gamma_i^* u_i^* \left( u_i (I - \gamma_i\gamma_i^*) u_i^* \right)^{-1} u_i\gamma_i \right)^{-1} \\ &= (I - \gamma_i\gamma_i^*)^{-1}\gamma_i \left( I + \gamma_i^* (I - \gamma_i\gamma_i^*)^{-1} \gamma_i \right)^{-1} \\ &= (I - \gamma_i\gamma_i^*)^{-1}\gamma_i (I - \gamma_i^*\gamma_i) = \gamma_i(t). \end{aligned}$$
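The reduction to proper form and the coincidence of the two sets of coefficients can be checked numerically in the scalar case. The sketch below assumes scalar entries $u$ and $v$ with $|u| > |v|$, $f = 0$ (so that $d = |u|^2 - |v|^2$), and $J = \mathrm{diag}(1, -1)$; it uses the elementary hyperbolic rotation $H(\gamma) = (1 - |\gamma|^2)^{-1/2} \begin{bmatrix} 1 & -\gamma \\ -\gamma^* & 1 \end{bmatrix}$. The function names are hypothetical.

```python
import math


def hyperbolic_rotation(gamma):
    """Scalar elementary hyperbolic rotation H(gamma), with |gamma| < 1."""
    gamma = complex(gamma)
    c = 1.0 / math.sqrt(1.0 - abs(gamma) ** 2)
    return [[c, -gamma * c], [-gamma.conjugate() * c, c]]


def j_unitarity_defect(H):
    """Largest entry of H J H^* - J for J = diag(1, -1)."""
    J = [1, -1]
    worst = 0.0
    for r in range(2):
        for s in range(2):
            val = sum(H[r][k] * J[k] * H[s][k].conjugate() for k in range(2))
            target = J[r] if r == s else 0
            worst = max(worst, abs(val - target))
    return worst


def proper_form_check(u, v):
    """Return (gamma, delta, leak, d, rho) for the row g = [u, v], f = 0."""
    u, v = complex(u), complex(v)
    gamma = v / u                      # Schur parameter: v = u * gamma
    H = hyperbolic_rotation(gamma)
    delta = u * H[0][0] + v * H[1][0]  # first entry of g H(gamma)
    leak = u * H[0][1] + v * H[1][1]   # second entry, expected to vanish
    d = abs(u) ** 2 - abs(v) ** 2      # d = g J g^* when f = 0
    # (1,2) and (2,2) entries of k_bar = I - J g^* d^{-1} g
    k12 = -u.conjugate() * v / d
    k22 = 1.0 + abs(v) ** 2 / d
    rho = -k12 / k22                   # generalized reflection coefficient
    return gamma, delta, leak, d, rho


gamma, delta, leak, d, rho = proper_form_check(1.0 + 0.5j, 0.2 - 0.4j)
assert abs(leak) < 1e-12              # g H(gamma) = [delta, 0]
assert abs(abs(delta) ** 2 - d) < 1e-12
assert abs(rho - gamma) < 1e-12       # the two coefficients coincide
```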

This result also follows by noting that the generator recursion (7) gets simplified once we incorporate into it the special choice $\Theta_i(t) = H(\gamma_i(t))$ and use (9) to write

$$k_i(t) = H(\gamma_i(t))^{-1}\bar{k}_i(t), \qquad h_i(t) = H(\gamma_i(t))^{-1}\bar{h}_i(t),$$

where $\{\bar{h}_i(t), \bar{k}_i(t)\}$ are as in (13). We readily conclude

$$J(t)k_i^*(t)J(t) = H(\gamma_i(t)) \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix}, \qquad h_i^*(t)J(t) = d_i^{-*/2}(t-1)\,\tau_i^*(t)\,d_i^{-1/2}(t)\,[\,\delta_i(t) \;\; 0\,].$$

Because of the assumption $\dim \mathcal{R}_i(t) = \dim \mathcal{U}(t)$ for all $t \in \mathbb{Z}$, $i = 0, 1, \ldots, n-1$, and the fact that $\delta_i(t)\delta_i^*(t) = g_i(t)J(t)g_i^*(t) = d_i(t)$, it follows from a simple Schur complement argument that $I - \delta_i^*(t)d_i^{-1}(t)\delta_i(t) = 0$. These facts further allow us to choose the unitary matrix $\tau_i(t)$ so as to satisfy the relation $\delta_i^*(t-1)d_i^{-*/2}(t-1)\tau_i^*(t) = \delta_i^*(t)d_i^{-*/2}(t)$, and the generator recursion (7) gets simplified to the following

$$\begin{bmatrix} 0 \\ G_{i+1}(t) \end{bmatrix} = F_i(t)G_i(t-1)H(\gamma_i(t-1)) \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} + G_i(t)H(\gamma_i(t)) \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix}. \tag{15}$$

It is thus clear that

$$J(t)k_i^*(t)J(t) = H(\gamma_i(t)) \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} 0 & -\gamma_i(t)\,[I - \gamma_i^*(t)\gamma_i(t)]^{-1/2} \\ 0 & [I - \gamma_i^*(t)\gamma_i(t)]^{-1/2} \end{bmatrix}$$

is block upper-triangular, and the entire cascade will exhibit $D(T_{21}) = 0$. We can also obtain an expression for the value of (3).

Theorem 3: Consider the setting of Theorem 2 and let $\Delta = [\Delta_{tt}]_{t \in \mathbb{Z}}$ denote the optimal diagonal operator,

$$\Delta = \max_{S \in \mathcal{S}} \{ D(\Psi_S)^* D(\Psi_S) \},$$

where $\Psi_S$ is the spectral factor of $I - S^*S$. Then

$$\Delta_{tt} = [I - \gamma_0^*(t)\gamma_0(t)]^{1/2}\,[I - \gamma_1^*(t)\gamma_1(t)]^{1/2} \cdots [I - \gamma_{n-1}^*(t)\gamma_{n-1}(t)] \cdots [I - \gamma_1^*(t)\gamma_1(t)]^{1/2}\,[I - \gamma_0^*(t)\gamma_0(t)]^{1/2}.$$

Proof: Let $\mathbf{T}_i$ denote the $i$th section associated with the proper generator recursion (15). We already know that the central solution of the corresponding cascade $\mathbf{T}$ coincides with the maximum entropy solution and, consequently, the optimal diagonal operator equals $(D(T_{22})^*)^{-1}(D(T_{22}))^{-1}$. But, for each section $\mathbf{T}_i$, we have $[D(T_{22,i})]_{tt} = (I - \gamma_i^*(t)\gamma_i(t))^{-1/2}$. Therefore,

$$[D(T_{22})]_{tt} = (I - \gamma_0^*(t)\gamma_0(t))^{-1/2}\,(I - \gamma_1^*(t)\gamma_1(t))^{-1/2} \cdots (I - \gamma_{n-1}^*(t)\gamma_{n-1}(t))^{-1/2},$$

and the required result now follows.

The previous discussion can be extended even if we drop assumption (11), viz., that $\dim \mathcal{R}_i(t) = \dim \mathcal{U}(t)$ for all $t \in \mathbb{Z}$ and $i = 0, 1, \ldots, n-1$. We omit the details here. We may add that the case of strictly lower-triangular $F(t)$ covers the band completion problems studied in [16], as well as some contractive extension problems considered in [6, 9, 19]; see [4, 3] for details. It is also connected with the so-called time-domain model validation problem (see, e.g., [4]).

B. Lower-Triangular F(t)

The notion of proper generators can also be extended, under additional assumptions, to the case of lower-triangular $F(t)$ (i.e., an $F(t)$ that is not necessarily strictly lower-triangular) [3]. However, as the formulae (24) and (26) in [3] show, the associated proper recursion does not lead to upper-triangular terms $J(t)k_i^*(t)J(t)$ and, consequently, the individual sections $\mathbf{T}_i$ will not satisfy the requirement $D(T_{21,i}) = 0$. This means that the central solution of the cascade $\mathbf{T}$ that is constructed via the proper recursion, and using the classical Schur parameters, will not generally correspond to the maximum-entropy solution.

The best illustration of this case is the classical Nevanlinna recursion, which maps Schur functions $s_i(z)$ (i.e., functions that are analytic and bounded by unity in the unit disc) to Schur functions $s_{i+1}(z)$ as follows:

$$s_{i+1}(z) = \frac{1 - f_i^* z}{z - f_i}\, \frac{s_i(z) - \gamma_i}{1 - \gamma_i^* s_i(z)}, \qquad \gamma_i = s_i(f_i), \quad s_0(z) = s(z), \quad i \ge 0. \tag{16}$$

This relation can be linearized by expressing $s_i(z)$ as the ratio of two power series, $s_i(z) = v_i(z)/u_i(z)$. It follows from (16) that we can also write

$$(z - f_i) \begin{bmatrix} u_{i+1}(z) & v_{i+1}(z) \end{bmatrix} = \begin{bmatrix} u_i(z) & v_i(z) \end{bmatrix} H(\gamma_i) \begin{bmatrix} \dfrac{z - f_i}{1 - f_i^* z} & 0 \\ 0 & 1 \end{bmatrix}, \tag{17}$$

where $H(\gamma_i)$ is the elementary hyperbolic rotation,

$$H(\gamma_i) = \frac{1}{\sqrt{1 - |\gamma_i|^2}} \begin{bmatrix} 1 & -\gamma_i \\ -\gamma_i^* & 1 \end{bmatrix}, \qquad \gamma_i = \lim_{z \to f_i} \frac{v_i(z)}{u_i(z)}.$$

We see that each step of (17) gives rise to a first-order $J$-lossless section with transfer function [5]

$$\mathbf{T}_i(z) = H(\gamma_i) \begin{bmatrix} \dfrac{z - f_i}{1 - f_i^* z} & 0 \\ 0 & 1 \end{bmatrix}.$$

The resulting cascade $\mathbf{T}(z)$ that can be associated with $n$ steps of the above recursion is given by (see [5] for details, where these cascades were discussed in the context of time-invariant displacement equations of the form $R - FRF^* = GJG^*$)

$$\mathbf{T}(z) = \mathbf{T}_0(z)\mathbf{T}_1(z) \cdots \mathbf{T}_{n-1}(z).$$

Let us partition $\mathbf{T}(z)$ accordingly with $J = (1 \oplus -1)$,

$$\mathbf{T}(z) = \begin{bmatrix} T_{11}(z) & T_{12}(z) \\ T_{21}(z) & T_{22}(z) \end{bmatrix},$$

and consider its central solution

$$\mathbf{T}[0] = -\frac{T_{12}(z)}{T_{22}(z)}.$$

[Remark. The notation $\mathbf{T}[0]$ for the central solution should not be confused with $\mathbf{T}(0)$, the value of $\mathbf{T}(z)$ at $z = 0$].

The question of interest is whether this central solution, which corresponds to the classical Schur parameters $\{\gamma_i\}$, has the maximum-entropy property. According to Lemma 2, the central solution coincides with the maximum-entropy solution if, and only if, $T_{21}(z)$ is a strictly proper rational matrix function or, equivalently, $T_{21}(0) = 0$. So let us verify if this condition is always met in the Nevanlinna case. For this purpose, we focus only, and without loss of generality, on the first two sections. That is, assume we have $n = 2$. This leads to a cascade $\mathbf{T}(z) = \mathbf{T}_0(z)\mathbf{T}_1(z)$,

$$\mathbf{T}(z) = H(\gamma_0) \begin{bmatrix} B_0(z) & 0 \\ 0 & 1 \end{bmatrix} H(\gamma_1) \begin{bmatrix} B_1(z) & 0 \\ 0 & 1 \end{bmatrix}, \qquad B_i(z) = \frac{z - f_i}{1 - f_i^* z},$$

whose $(2,1)$ entry is then equal to

$$T_{21}(z) = -\frac{1}{\sqrt{1 - |\gamma_0|^2}}\, \frac{1}{\sqrt{1 - |\gamma_1|^2}} \left[ \gamma_0^* B_0(z)B_1(z) + \gamma_1^* B_1(z) \right].$$

Therefore,

$$T_{21}(0) = -\frac{1}{\sqrt{1 - |\gamma_0|^2}}\, \frac{1}{\sqrt{1 - |\gamma_1|^2}} \left[ \gamma_0^* B_0(0)B_1(0) + \gamma_1^* B_1(0) \right],$$

and it is clear that, in general, we have $T_{21}(0) \ne 0$, thus confirming our earlier claim that the central solution of the Nevanlinna cascade does not coincide, in general, with the maximum-entropy solution. It is also clear that if $f_1 = 0$ and, consequently, $B_1(z) = z$, then $T_{21}(0) = 0$ and the central solution will coincide with the maximum-entropy solution.

We now show how to use our earlier results in order to modify the Nevanlinna recursion and obtain an algorithm that leads to a cascade whose central solution coincides with the maximum-entropy solution. To clarify this, we first elaborate on the connection between the Schur parameters $\{\gamma_i\}$ and the generalized reflection coefficients $\{\rho_i\}$. Indeed, we choose $\Theta_i = I$ and $\tau_i = \dfrac{1 + f_i}{1 + f_i^*}$ in (9) and write $\bar{k}_i = I - J g_i^* d_i^{-1} \left( 1 - \dfrac{1 + f_i}{1 + f_i^*} f_i^* \right)^{-1} g_i$. The generalized reflection coefficient $\rho_i$ is then related to the Schur parameter $\gamma_i$ via

$$\rho_i = \frac{1 + f_i^*}{1 + f_i^* |\gamma_i|^2}\, \gamma_i. \tag{18}$$
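Both observations, that $T_{21}(0)$ is generally nonzero for the classical cascade and that the generalized reflection coefficient is a rescaled Schur parameter, can be checked with a short numerical sketch. The code assumes scalar data, $J = \mathrm{diag}(1, -1)$, a row $g = [u \;\; v]$ with $v = u\gamma$, and $d = (|u|^2 - |v|^2)/(1 - |f|^2)$; the sample values and the function names are hypothetical.

```python
import math


def hyperbolic_rotation(gamma):
    """Scalar elementary hyperbolic rotation H(gamma), with |gamma| < 1."""
    gamma = complex(gamma)
    c = 1.0 / math.sqrt(1.0 - abs(gamma) ** 2)
    return [[c, -gamma * c], [-gamma.conjugate() * c, c]]


def blaschke(z, f):
    """B(z) = (z - f) / (1 - f^* z)."""
    z, f = complex(z), complex(f)
    return (z - f) / (1 - f.conjugate() * z)


def section(z, f, gamma):
    """First-order section H(gamma) diag(B(z), 1)."""
    H = hyperbolic_rotation(gamma)
    B = blaschke(z, f)
    return [[H[0][0] * B, H[0][1]], [H[1][0] * B, H[1][1]]]


def t21_two_sections(z, f0, g0, f1, g1):
    """(2,1) entry of the two-section cascade T_0(z) T_1(z)."""
    A, C = section(z, f0, g0), section(z, f1, g1)
    return A[1][0] * C[0][0] + A[1][1] * C[1][0]


def t21_closed_form(z, f0, g0, f1, g1):
    """Closed-form expression for the same (2,1) entry."""
    g0, g1 = complex(g0), complex(g1)
    scale = 1 / math.sqrt((1 - abs(g0) ** 2) * (1 - abs(g1) ** 2))
    B0, B1 = blaschke(z, f0), blaschke(z, f1)
    return -scale * (g0.conjugate() * B0 * B1 + g1.conjugate() * B1)


def rho_formula(f, gamma):
    """rho = (1 + f^*) gamma / (1 + f^* |gamma|^2)."""
    f, gamma = complex(f), complex(gamma)
    fc = f.conjugate()
    return (1 + fc) * gamma / (1 + fc * abs(gamma) ** 2)


def rho_from_kbar(f, u, gamma):
    """rho = -k12 / k22 with k_bar built from tau = (1 + f) / (1 + f^*)."""
    f, u, gamma = complex(f), complex(u), complex(gamma)
    v = u * gamma
    d = (abs(u) ** 2 - abs(v) ** 2) / (1 - abs(f) ** 2)
    tau = (1 + f) / (1 + f.conjugate())
    c = 1 / (1 - tau * f.conjugate())
    k12 = -c * u.conjugate() * v / d
    k22 = 1 + c * abs(v) ** 2 / d
    return -k12 / k22


f0, g0, f1, g1 = 0.5, 0.4, 0.3 + 0.1j, -0.2 + 0.3j
assert abs(t21_two_sections(0, f0, g0, f1, g1)) > 1e-3   # nonzero in general
assert abs(t21_two_sections(0, f0, g0, 0, g1)) < 1e-12   # vanishes if f1 = 0
assert abs(rho_formula(f1, g1) - rho_from_kbar(f1, 1.2 - 0.3j, g1)) < 1e-12
```

Setting `f = 0` in `rho_formula` returns `gamma` itself, which matches the strictly lower-triangular case treated earlier.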





This leads to the choices $h_i = H(\rho_i)^{-1} d_i^{-1} J g_i^*$, $k_i = H(\rho_i)^{-1}\bar{k}_i$, and to the first-order sections (see [5] for details),

$$\mathbf{T}_{\rho,i}(z) = \left\{ I + [B_i(z) - 1]\, \frac{J g_i^* g_i}{g_i J g_i^*} \right\} H(\rho_i), \qquad B_i(z) = \frac{z - f_i}{1 - f_i^* z}. \tag{19}$$

These sections are related to the earlier $\mathbf{T}_i(z)$ via $\mathbf{T}_{\rho,i}(z) = \mathbf{T}_i(z)H(\gamma_i)^{-1}H(\rho_i)$. The corresponding generator recursion is given by

$$\begin{bmatrix} 0 \\ G_{i+1} \end{bmatrix} = \left\{ G_i + (\Phi_i - I)\, G_i\, \frac{J g_i^* g_i}{g_i J g_i^*} \right\} H(\rho_i). \tag{20}$$

A simple computation shows that

$$H(\gamma_i)^{-1}H(\rho_i) = \frac{\left| 1 + f_i |\gamma_i|^2 \right|}{\left( 1 - |f_i|^2 |\gamma_i|^2 \right)^{1/2}} \begin{bmatrix} \dfrac{1}{1 + f_i |\gamma_i|^2} & -\dfrac{f_i^* \gamma_i}{1 + f_i^* |\gamma_i|^2} \\[2ex] -\dfrac{f_i \gamma_i^*}{1 + f_i |\gamma_i|^2} & \dfrac{1}{1 + f_i^* |\gamma_i|^2} \end{bmatrix}.$$

If we define $\nu_i = \dfrac{1 + f_i^* |\gamma_i|^2}{1 + f_i |\gamma_i|^2}$ and $c_i = f_i^* \gamma_i$, then the generator recursion (20) leads to a modified Nevanlinna recursion of the type:

$$\frac{\nu_i s_{i+1}(z) + c_i}{1 + c_i^* \nu_i s_{i+1}(z)} = \frac{1 - f_i^* z}{z - f_i}\, \frac{s_i(z) - \gamma_i}{1 - \gamma_i^* s_i(z)}, \qquad \gamma_i = s_i(f_i), \quad s_0 = s, \quad i \ge 0. \tag{21}$$

The central solution of the cascade associated with this modified recursion now coincides with the maximum-entropy solution. We should mention that a detailed analysis of this type of recursion appears in [17], where it is shown that (21) facilitates the study of the Nevanlinna-Pick problem for an infinite number of data.

IV. Concluding Remarks

We have shown that the displacement structure theory allows a general formulation of the maximum entropy problem and yields both global and recursive solutions. A new set of contractive coefficients has also been shown to arise in this context; these coefficients are different from those encountered in other applications of the displacement theory, e.g., in factorization and interpolation problems.

References

[1] J. P. Burg. Maximum entropy spectral analysis. In Proc. 37th Meeting of Society of Exploration Geophysicists, Oklahoma City, Okla., October 1967.
[2] T. Kailath and A. H. Sayed. Displacement structure: Theory and applications. SIAM Review, vol. 37, no. 3, pp. 297–386, September 1995.
[3] A. H. Sayed, T. Constantinescu, and T. Kailath. Time-variant displacement structure and interpolation problems. IEEE Transactions on Automatic Control, 39(5):960–976, May 1994.
[4] T. Constantinescu, A. H. Sayed, and T. Kailath. Displacement structure and completion problems. SIAM J. Matrix Anal. Appl., vol. 16, no. 1, pp. 58–78, Jan. 1995.
[5] A. H. Sayed, T. Kailath, H. Lev-Ari, and T. Constantinescu. Recursive solutions of rational interpolation problems via fast matrix factorization. Integral Equations and Operator Theory, 20:84–118, October 1994.
[6] I. Gohberg, M. A. Kaashoek, and H. J. Woerdeman. A maximum entropy principle in the general framework of the band method. J. Functional Analysis, 95:231–254, 1991.
[7] C. Foias, A. Frazho, and I. Gohberg. Central intertwining lifting, maximum entropy and their permanence. Integral Equations and Operator Theory, 18:166–201, 1994.
[8] P. A. Iglesias. An entropy formula for time-varying discrete-time control problems. Proc. 28th Conf. Information Sciences and Systems, pp. 214–219, Princeton, NJ, Mar. 1994. [Also to appear in SIAM J. Control and Optimization, Sep. 1996].
[9] P. Dewilde and H. Dym. Interpolation for upper triangular operators. Operator Theory: Advances and Applications, ed. I. Gohberg, Birkhäuser Verlag Basel, vol. 56, pp. 153–260, 1992.
[10] T. Constantinescu. Schur analysis of positive block-matrices. Operator Theory: Advances and Applications, 18:191–206, 1986. Edited by I. Gohberg, Birkhäuser, Boston.
[11] M. Rosenblum and J. Rovnyak. Hardy Classes and Operator Theory. Oxford Univ. Press, 1985.
[12] B. Sz. Nagy and C. Foias. Harmonic Analysis of Operators on Hilbert Space. North Holland Publishing Co., Amsterdam-Budapest, 1970.
[13] D. Z. Arov and M. G. Krein. On the evaluation of entropy functionals and their minima in generalized extension problems. Acta Sci. Math (Szeged), 45:33–50, 1983.
[14] H. Dym. J-Contractive Matrix Functions, Reproducing Kernel Hilbert Spaces and Interpolation, CBMS, American Mathematical Society, vol. 71, RI, 1989.
[15] J. A. Ball, I. Gohberg, and M. A. Kaashoek. Nevanlinna-Pick interpolation for time-varying input-output maps: The discrete case. Operator Theory: Advances and Applications, 56:1–51, 1992. Birkhäuser Verlag Basel, ed. I. Gohberg.
[16] H. Dym and I. Gohberg. Extensions of band matrices with band inverses. Linear Algebra and Its Applications, 36:1–24, 1981.
[17] J. B. Garnett. Bounded Analytic Functions. Academic Press, NY, 1981.
[18] H. Lev-Ari and T. Kailath. Lattice filter parametrization and modeling of nonstationary processes. IEEE Transactions on Information Theory, 30(1):2–16, January 1984.
[19] J. A. Ball and I. Gohberg. A commutant lifting theorem for triangular matrices with diverse applications. Integral Equations and Operator Theory, 8:205–267, 1985.