TRANSFORM DOMAIN LMS ALGORITHMS FOR SPARSE SYSTEM IDENTIFICATION

Kun Shi† and Xiaoli Ma‡

† Texas Instruments, Dallas, TX, USA. Email: [email protected]
‡ School of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, USA. Email: [email protected]

ABSTRACT

This paper proposes a new adaptive algorithm that improves least mean square (LMS) performance for sparse system identification in the presence of colored inputs. An l1-norm penalty on the filter coefficients is incorporated into the quadratic LMS cost function to improve LMS performance in sparse systems. Different from existing algorithms, the adaptive filter coefficients are updated in the transform domain (TD) to reduce the eigenvalue spread of the input signal correlation matrix. Correspondingly, the l1-norm constraint is applied to the TD filter coefficients, which yields the TD zero-attracting LMS (TD-ZA-LMS) and TD reweighted-zero-attracting LMS (TD-RZA-LMS) algorithms. Compared to the ZA-LMS and RZA-LMS algorithms [10], the proposed TD-ZA-LMS and TD-RZA-LMS algorithms are proven to have the same steady-state behavior but achieve a faster convergence rate with non-white system inputs. The effectiveness of the proposed algorithms is demonstrated through computer simulations.

Index Terms— least mean square (LMS), l1 norm, sparsity, adaptive filters

1. INTRODUCTION

The least mean square (LMS) algorithm is one of the most popular methods for adaptive system identification [1]. In many applications, the unknown system response can be assumed to be sparse, containing many near-zero coefficients and few large ones; examples include digital TV transmission channels [2] and echo paths [3]. Exploiting the sparse nature of a system can improve adaptive filter performance, and the literature contains many algorithms that use system sparsity in different ways. For example, [4] applies a sequential partial-updating scheme during the filtering process; [3, 5] use proportionate updating, which adapts each coefficient with an adaptation gain proportional to its own magnitude; and [6, 7] locate and track non-zero coefficients by dynamically adjusting the filter length. Based on recent research on the least absolute shrinkage and selection operator (LASSO) [8] and compressive sensing [9], [10] proposes an LMS algorithm with an l1-norm constraint to speed up sparse system identification.

The l1-norm penalty is incorporated into the cost function of the standard LMS algorithm, which results in an LMS update with a zero attractor; the resulting algorithms are named zero-attracting LMS (ZA-LMS) and reweighted-zero-attracting LMS (RZA-LMS). The ZA-LMS and RZA-LMS algorithms have the same convergence condition as the standard LMS, which is dominated by the eigenvalue spread of the system input correlation matrix. It is well known that colored inputs tend to deteriorate the convergence performance of LMS adaptive filters [1]. To the best of our knowledge, no existing work aims to accelerate LMS convergence for sparse systems with colored inputs. This paper is therefore motivated to address sparse LMS with fast convergence in the presence of correlated inputs.

It has been shown in [11] that transforming the input signal into another domain can reduce the eigenvalue spread of the input signal correlation matrix, which consequently accelerates the convergence of adaptive filters. Thus, this paper investigates the sparse LMS in the transform domain (TD). Specifically, we relate the filter coefficients in the time domain and the TD and propose the TD-ZA-LMS and TD-RZA-LMS algorithms.

Notation: In the remainder of the paper, matrices are denoted by boldface upper-case letters; vectors in the time domain and the TD are denoted by italic boldface lower-case and upper-case letters, respectively; the superscripts (·)^H, (·)^T, and (·)^{-1} denote the Hermitian, transpose, and inverse operators, respectively; ||·||_1 denotes the l1 norm; and E[·] denotes statistical expectation.

2. NEW LMS ALGORITHM

2.1. Review of the TD LMS algorithm

Consider a linear system with input signal x(n) and output signal d(n) related by

d(n) = h^T x(n) + v(n),    (1)

where h = [h_0, ..., h_{L-1}]^T is the unknown system with memory length L, x(n) = [x(n), ..., x(n-L+1)]^T is the system input vector, and v(n) is the additive noise.

The goal of LMS-type adaptive filters is to estimate the unknown system coefficient vector h using the input signal x(n) and the output signal d(n).

In the TD LMS algorithm [11], the input signal vector x(n) is first transformed into another vector X(n) using an orthogonal transformation:

X(n) = F x(n),    (2)

where F is a unitary matrix; for example, F can be the discrete Fourier transform (DFT) or the discrete cosine transform (DCT) matrix. Denoting W(n) to be the TD filter coefficient vector, the estimate of d(n) is represented by the TD filter coefficients W(n) and the TD signal X(n):

d̂(n) = W^H(n) X(n),    (3)

and the corresponding estimation error is

e(n) = d(n) − d̂(n) = d(n) − W^H(n) X(n).    (4)

The filter coefficients of the TD LMS are then updated by

W(n+1) = W(n) + U e(n) X(n),    (5)

where U = μΛ^{-1} is the step-size matrix and Λ is an L × L diagonal matrix whose (i, i)th element equals the ith TD signal component power E[|X_i(n)|^2]. Denote R_XX and R_xx to be the autocorrelation matrices of X(n) and x(n), respectively, and χ(A) to be the eigenvalue spread of a matrix A, defined as

χ(A) = λ_max / λ_min,    (6)

where λ_max and λ_min are the maximum and minimum eigenvalues of A, respectively. Denote γ(A) to be the upper bound of χ(A). It has been shown in [11] that

γ(Λ^{-1} R_XX) = γ(Λ^{-1} F R_xx F^H) ≤ γ(R_xx),    (7)

which indicates that the TD LMS algorithm can be expected to have better convergence properties than the corresponding time-domain one.
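For concreteness, the following minimal NumPy sketch implements the TD-LMS recursion (2)–(5) with a DFT transform and a running estimate of the TD component powers that form the diagonal of Λ. It is our own illustration of the reviewed algorithm, not the authors' code; all variable names are ours, and the forgetting factor beta is an assumption.

```python
import numpy as np

def td_lms(x, d, L, mu=0.05, beta=0.99, reg=1e-8):
    """Transform-domain LMS, eqs. (2)-(5): W(n+1) = W(n) + mu * Lambda^{-1} e(n) X(n)."""
    F = np.fft.fft(np.eye(L)) / np.sqrt(L)    # unitary DFT matrix (one choice of F)
    W = np.zeros(L, dtype=complex)            # TD coefficient vector W(n)
    p = np.full(L, reg)                       # running estimate of E[|X_i(n)|^2], diag of Lambda
    xbuf = np.zeros(L)                        # x(n) = [x(n), ..., x(n-L+1)]^T
    for n in range(len(x)):
        xbuf = np.r_[x[n], xbuf[:-1]]
        X = F @ xbuf                          # eq. (2): X(n) = F x(n)
        e = d[n] - np.vdot(W, X)              # eq. (4): e(n) = d(n) - W^H(n) X(n)
        p = beta * p + (1 - beta) * np.abs(X) ** 2
        W = W + mu * np.conj(e) * X / p       # eq. (5) with U = mu * Lambda^{-1}
    return np.real(F.conj().T @ W)            # time-domain estimate: Re[F^H W]
```

For a white input, p is nearly flat and the recursion behaves like the standard LMS up to the unitary transform; the per-component normalization matters precisely when the input is colored.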

2.2. TD-ZA-LMS algorithm

In this section, sparse system identification using LMS algorithms is investigated. Motivated by the better convergence of the TD LMS algorithm, we develop a TD sparse LMS algorithm. We propose a new cost function J1(n) by combining the instantaneous squared error with the l1-norm penalty of the filter coefficient vector:

J1(n) = (1/2) e^2(n) + λ ||Re[F^H W(n)]||_1,    (8)

where Re[·] denotes the real part of a complex quantity. Note that, to exploit sparsity, the TD coefficient vector W(n) is transformed back to the time domain. Using gradient-descent updating, the filter coefficients are updated as

W(n+1) = W(n) − U ∂J1(n)/∂W^H(n)
        = W(n) + U e(n) X(n) − λ U F^T sgn(Re[F^H W(n)])
        = W(n) + μΛ^{-1} e(n) X(n) − ρΛ^{-1} F^T sgn(Re[F^H W(n)]),    (9)

where ρ = μλ and sgn(·) is the component-wise sign function defined as

sgn(x) = x/|x| for x ≠ 0, and sgn(x) = 0 for x = 0.    (10)

Similar to the ZA-LMS algorithm in [10], the penalty term in (8) attracts the filter coefficients to zero, which speeds up convergence when the majority of the coefficients of h are zero, i.e., when the system is sparse. However, different from the ZA-LMS algorithm, the proposed method updates the filter coefficients in the TD, the purpose being to improve the filter convergence rate for non-white input signals. Thus, we call the proposed algorithm TD-ZA-LMS; the entire algorithm is described sequentially by (2), (4), and (9).
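Relative to the TD-LMS sketch above, the only change is the zero-attractor term of (9). A hedged one-step sketch, again with our own names rather than the authors' code:

```python
import numpy as np

def td_za_lms_step(W, X, e, p, F, mu, rho):
    """One TD-ZA-LMS update, eq. (9).

    W, X : TD coefficient and input vectors; e : a-priori error, eq. (4)
    p    : per-component power estimates (the diagonal of Lambda)
    """
    h_td = np.real(F.conj().T @ W)        # time-domain coefficients Re[F^H W(n)]
    attractor = F.T @ np.sign(h_td)       # F^T sgn(Re[F^H W(n)])
    return W + mu * np.conj(e) * X / p - rho * attractor / p   # both terms scaled by Lambda^{-1}
```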

2.3. TD-RZA-LMS algorithm

It has been shown that the ZA-LMS algorithm achieves a biased estimate of h, and this bias limits the filter's steady-state performance [10]. Correspondingly, [10] proposes the RZA-LMS algorithm to reduce the bias via a cost function with l0-norm-like behavior. As will be shown later, a similar bias exists for TD-ZA-LMS (see Section 2.4). Thus, similar to the RZA-LMS algorithm, to reduce the estimation bias of the TD-ZA-LMS algorithm, we propose the TD-RZA-LMS algorithm via a new cost function:

J2(n) = (1/2) e^2(n) + λ Σ_{i=1}^{L} log(1 + ε|Re[F_i^H W(n)]|),    (11)

where F_i is the ith column of matrix F. Note that the log penalty behaves more like the l0 norm than the l1 norm. The TD coefficient vector is then updated by

W(n+1) = W(n) − U ∂J2(n)/∂W^H(n)
        = W(n) + U e(n) X(n) − λUε Σ_{i=1}^{L} F_i sgn(Re[F_i^H W(n)]) / (1 + ε|Re[F_i^H W(n)]|)
        = W(n) + μΛ^{-1} e(n) X(n) − ρΛ^{-1} F^T Φ^{-1}(n) sgn(Re[F^H W(n)]),    (12)

where ρ = μλε and Φ(n) is an L × L diagonal matrix whose (i, i)th element is

Φ_{i,i}(n) = 1 + ε|Re[F_i^H W(n)]|,  i = 1, ..., L.    (13)

Note that the penalty term in (11) attracts to zero only those time-domain coefficients Re[F_i^H W(n)] whose magnitudes are comparable to 1/ε. Different from the TD-ZA-LMS algorithm, this penalty enables TD-RZA-LMS to distinguish between zero taps and non-zero taps. Thus, the bias of TD-RZA-LMS is reduced and better steady-state performance results.
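The reweighting in (12)–(13) is a small modification of the TD-ZA-LMS step: the sign vector is divided component-wise by the diagonal of Φ(n) before being mapped back. A hedged sketch with our own names:

```python
import numpy as np

def td_rza_lms_step(W, X, e, p, F, mu, rho, eps_rw=10.0):
    """One TD-RZA-LMS update, eqs. (12)-(13).

    Taps with |Re[F_i^H W]| well above 1/eps_rw receive a weak attraction,
    so non-zero taps are left nearly untouched and the bias shrinks.
    """
    h_td = np.real(F.conj().T @ W)              # time-domain coefficients Re[F^H W(n)]
    phi = 1.0 + eps_rw * np.abs(h_td)           # diagonal of Phi(n), eq. (13)
    attractor = F.T @ (np.sign(h_td) / phi)     # F^T Phi^{-1}(n) sgn(Re[F^H W(n)])
    return W + mu * np.conj(e) * X / p - rho * attractor / p
```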

2.4. Convergence analysis

Denoting W_o to be the optimal filter coefficient vector in the TD, we have

W_o = F h.    (14)

Combining (14) and (2), we rewrite (1) as

d(n) = W_o^H X(n) + v(n).    (15)

Define the TD coefficient estimation error as

δ(n) = W(n) − W_o.    (16)

Combining (15) and (16), we rewrite the signal estimation error in (4) as

e(n) = −δ^H(n) X(n) + v(n).    (17)

For the TD-ZA-LMS algorithm, substituting (17) into (9), we obtain

δ(n+1) = (I_L − μΛ^{-1} X(n) X^H(n)) δ(n) + μΛ^{-1} X(n) v(n) − ρΛ^{-1} F^T sgn(Re[F^H W(n)]),    (18)

where I_L is the L × L identity matrix. With the independence assumption, taking expectations of (18) leads to

E[δ(n+1)] = (I_L − μΛ^{-1} R_XX) E[δ(n)] − ρΛ^{-1} F^T E[sgn(Re[F^H W(n)])].    (19)

Since ρΛ^{-1} F^T E[sgn(Re[F^H W(n)])] is bounded, E[δ(n)] can converge if every eigenvalue of (I_L − μΛ^{-1} R_XX) is less than 1 in magnitude. This indicates that the TD-ZA-LMS algorithm has the same requirement on the step size μ for convergence as the TD-LMS algorithm. It can be shown that the same convergence condition also applies to the TD-RZA-LMS algorithm.

As n approaches infinity in (19), an estimation bias results, given by

E[δ(∞)] = −(ρ/μ) R_XX^{-1} F^T E[sgn(Re[F^H W(n)])].    (20)

For the TD-RZA-LMS algorithm, the estimation bias is obtained in a similar manner:

E[δ(∞)] = −(ρ/μ) R_XX^{-1} F^T Φ^{-1}(n) E[sgn(Re[F^H W(n)])].    (21)

It can be seen that the estimation bias of the TD-RZA-LMS algorithm is reduced by the matrix Φ^{-1}(n) compared to that of the TD-ZA-LMS algorithm, which leads to better steady-state performance. Moreover, the corresponding time-domain representations of the biases in (20) and (21) are, respectively,

E[F^H δ(∞)] = −(ρ/μ) R_xx^{-1} E[sgn(Re[F^H W(n)])],    (22)

E[F^H δ(∞)] = −(ρ/μ) R_xx^{-1} Φ^{-1}(n) E[sgn(Re[F^H W(n)])],    (23)

which indicate that the TD-ZA-LMS and TD-RZA-LMS algorithms achieve steady-state performance comparable to that of ZA-LMS and RZA-LMS, respectively.
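The gain promised by (7) and the step-size condition implied by (19) are easy to check numerically. The sketch below compares the eigenvalue spreads χ(R_xx) and χ(Λ^{-1} R_XX) for an AR(1) input with pole 0.8 (the process used in Section 3); the values and names are illustrative assumptions, not from the paper.

```python
import numpy as np

L, a = 16, 0.8
k = np.arange(L)
r = a ** k / (1.0 - a ** 2)                      # AR(1) autocorrelation: r(k) = a^|k| / (1 - a^2)
Rxx = r[np.abs(k[:, None] - k[None, :])]         # Toeplitz input correlation matrix R_xx
F = np.fft.fft(np.eye(L)) / np.sqrt(L)           # unitary DFT matrix
RXX = F @ Rxx @ F.conj().T                       # R_XX = F R_xx F^H
Lam_inv = np.diag(1.0 / np.real(np.diag(RXX)))   # Lambda^{-1}: inverse TD component powers

def chi(A):
    """Eigenvalue spread chi(A) = lambda_max / lambda_min, eq. (6)."""
    w = np.sort(np.real(np.linalg.eigvals(A)))
    return w[-1] / w[0]

print(chi(Rxx), chi(Lam_inv @ RXX))              # per eq. (7), the second spread is much smaller
print(2.0 / np.real(np.linalg.eigvals(Lam_inv @ RXX)).max())   # rough step-size bound from (19)
```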

3. SIMULATION RESULTS

In this section, the performance of the proposed methods is assessed via computer simulations. Two experiments are designed to demonstrate the steady-state performance, convergence rate, and tracking ability of the proposed methods. For comparison purposes, we also implement the conventional LMS algorithm [1] and the ZA-LMS and RZA-LMS algorithms [10]. The mean square deviation (MSD) is taken as the metric, defined as

MSD(n) (dB) = 10 log_10 E[ ||F^H δ(n)||_2^2 ].    (24)

First, we evaluate the performance of the proposed methods with a white signal as the system input, adopting the simulation setup in [10]. The system is time varying and the channel response consists of 16 coefficients. In the first 500 iterations, the 5th tap is set to 1 and the others to zero, giving a system of sparsity 1/16. During the second 500 iterations, all the odd taps are set to 1 while all the even taps are set to zero, i.e., a sparsity of 8/16. After 1000 iterations, all the even taps are set to −1 while all the odd taps remain 1, giving a completely non-sparse system. The filter input is a zero-mean i.i.d. Gaussian process with unit variance. The noise is another white zero-mean Gaussian process such that the signal-to-noise ratio (SNR) is 30 dB. The parameters are set as μ = 0.05, ρ = 5 × 10^{-4}, and ε = 10 [10]; the same parameters are used for all the LMS filters. In the proposed methods, we use the DFT as the orthogonal transform, i.e., F in (2) is the DFT matrix.

The MSD curves are shown in Fig. 1. We observe that when the system is sparse, the sparse-type LMS algorithms (ZA-LMS, RZA-LMS, TD-ZA-LMS, and TD-RZA-LMS) perform better than the standard LMS in terms of both convergence rate and MSD. As the fraction of non-zero taps grows, the sparse-type LMS algorithms perform more and more comparably with the standard LMS. Moreover, the TD-ZA-LMS and TD-RZA-LMS algorithms achieve a slightly faster convergence rate than their corresponding time-domain counterparts. A sketch of the experimental setup is given below.
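The following sketch reconstructs the time-varying system of the first experiment and the MSD metric of (24) from the description above; it is our own reconstruction of the stated setup, not the authors' code.

```python
import numpy as np

L = 16
h1 = np.zeros(L); h1[4] = 1.0       # iterations 1-500: the 5th tap is 1 (sparsity 1/16)
h2 = np.zeros(L); h2[0::2] = 1.0    # iterations 501-1000: odd taps (1st, 3rd, ...) are 1 (sparsity 8/16)
h3 = h2.copy();   h3[1::2] = -1.0   # after 1000: even taps become -1, odd taps stay 1 (non-sparse)

def msd_db(h_hat, h):
    """MSD in dB, eq. (24); F^H delta(n) equals h_hat - h in the time domain.
    In practice the expectation is replaced by an average over Monte Carlo runs."""
    return 10.0 * np.log10(np.sum(np.abs(h_hat - h) ** 2))
```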

[Fig. 1. MSDs (dB) versus number of samples with a white-process input signal; curves: LMS, ZA-LMS, RZA-LMS, TD-ZA-LMS, TD-RZA-LMS.]

[Fig. 2. MSDs (dB) versus number of samples with an AR(1)-process input signal; curves: LMS, ZA-LMS, RZA-LMS, TD-ZA-LMS, TD-RZA-LMS.]

Second, we evaluate the proposed methods with a non-white system input. The input signal x(n) is an AR(1) process generated by filtering a white Gaussian noise through the first-order system 1/(1 − 0.8 z^{-1}). The system responses are the same as in the first experiment, except that the switching times are the 6000th and 12000th iterations, respectively. The filter parameters are set as μ = 0.015, ρ = 3 × 10^{-5}, and ε = 10 for LMS, ZA-LMS, and RZA-LMS; μ = 0.005 and ρ = 2.6 × 10^{-6} for TD-ZA-LMS; and μ = 0.005, ρ = 7 × 10^{-6}, and ε = 10 for TD-RZA-LMS. Different parameters are used so that all the LMS algorithms reach a similar steady-state error, allowing their convergence rates to be compared. The MSD curves are depicted in Fig. 2. As expected, both TD-ZA-LMS and TD-RZA-LMS converge faster than their corresponding time-domain counterparts under all three channel sparsity levels, since the eigenvalue spread of the input correlation matrix is reduced through the orthogonal transform. Note that different orthogonal transforms may result in different eigenvalue spreads; the selection of the orthogonal transform is beyond the scope of this paper. The AR(1) input can be generated as sketched below.
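A minimal sketch of the stated input generation; the seed and sequence length are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, a = 18000, 0.8
w = rng.standard_normal(N)        # white Gaussian driving noise
x = np.empty(N)
x[0] = w[0]
for n in range(1, N):
    x[n] = a * x[n - 1] + w[n]    # AR(1): filtering w(n) through 1 / (1 - 0.8 z^-1)
```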

4. CONCLUSIONS

In order to improve sparse system identification, a new LMS algorithm is proposed in this paper. The l1 norm is introduced into the cost function to exploit the sparsity of the system. The filter coefficients are updated in the TD to reduce the eigenvalue spread of the input signal correlation matrix. Compared to the existing methods, the proposed methods exhibit better convergence, and this advantage is demonstrated by computer simulations.

5. REFERENCES


[1] A. H. Sayed, Fundamentals of Adaptive Filtering, New York: Wiley, 2003.
[2] W. F. Schreiber, "Advanced television systems for terrestrial broadcasting: some problems and some proposed solutions," Proceedings of the IEEE, vol. 83, pp. 958–981, 1995.
[3] D. L. Duttweiler, "Proportionate normalized least-mean-squares adaptation in echo cancellers," IEEE Trans. on Speech and Audio Processing, vol. 8, pp. 508–518, 2000.
[4] D. M. Etter, "Identification of sparse impulse response systems using an adaptive delay filter," Proc. IEEE ICASSP, pp. 1167–1172, Tampa, FL, Mar. 1985.
[5] J. Benesty and S. L. Gay, "An improved PNLMS algorithm," Proc. IEEE ICASSP, pp. 1881–1884, 2002.
[6] Y. Li, Y. Gu, and K. Tang, "Parallel NLMS filters with stochastic active taps and step-sizes for sparse system identification," Proc. IEEE ICASSP, pp. 109–112, Toulouse, France, May 2006.
[7] O. A. Noskoski and J. Bermudez, "Wavelet-packet-based adaptive algorithm for sparse impulse response identification," Proc. IEEE ICASSP, pp. 1321–1324, Honolulu, HI, Apr. 2007.
[8] R. Tibshirani, "Regression shrinkage and selection via the lasso," Journal of the Royal Statistical Society, Series B, vol. 58, pp. 267–288, 1996.
[9] D. Donoho, "Compressed sensing," IEEE Trans. Information Theory, vol. 52, pp. 1289–1306, Apr. 2006.
[10] Y. Chen, Y. Gu, and A. O. Hero, "Sparse LMS for system identification," Proc. IEEE ICASSP, pp. 3125–3128, Taipei, Taiwan, Apr. 2009.
[11] S. S. Narayan, A. M. Peterson, and M. J. Narasimha, "Transform domain LMS algorithm," IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. ASSP-31, pp. 609–615, Jun. 1983.