2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

October 18-21, 2009, New Paltz, NY

ADAPTIVE FIR FILTERS WITH AUTOMATIC LENGTH OPTIMIZATION BY MONITORING A NORMALIZED COMBINATION SCHEME

Marcus Zeller (1), Luis A. Azpicueta-Ruiz (2) and Walter Kellermann (1)

(1) Multimedia Comm. and Signal Processing, University of Erlangen-Nuremberg, Cauerstr. 7, 91058 Erlangen, Germany
{zeller,wk}@LNT.de

(2) Dept. of Signal Theory and Communications, Universidad Carlos III de Madrid, 28911 Leganés-Madrid, Spain
[email protected]

ABSTRACT

This paper presents a novel strategy of adaptive filtering which provides an automatic self-configuration of the filter structure in terms of memory length. By monitoring the adaptive mixing of a normalized combination of two competing filters with a different number of coefficients, an online estimate of the optimum filter length is obtained and used to dynamically scale the size of the employed filters. Furthermore, a more efficient, simplified version of this approach is proposed and shown to be equally effective while significantly reducing the required complexity. Experimental results for high-order real-world systems as well as stationary noise and speech signals demonstrate the good performance and the robust tracking behaviour of the outlined algorithms in the context of realistic system identification scenarios.

Index Terms: Adaptive Filters, Variable Tap-Length, Acoustic Echo Cancellation, Scalability

1. INTRODUCTION

Adaptive filters have been a major focus of research for quite some time, which is due to their ability to adjust to unknown and time-variant environments. Over the years, many different adaptation algorithms have been developed which are usually based on transversal FIR filter structures as a consequence of their ensured stability properties [1]. Although many analyses are still carried out under the assumption that the adaptive filter and the unknown system are matched in size, several authors have also investigated algorithms for realizing and controlling filters that use a variable number of filter taps [2, 3]. However, these methods often still rely on some kind of a-priori assumption [4] or have not yet been investigated for systems with very high filter order [5] as present in many practical applications.
In this paper, we propose a novel method for the automatic self-configuration of adaptive filters in terms of memory size by exploiting the inherent soft decision property of the adaptive mixing of a normalized filter combination. Since the presented method lends itself to an efficient, simplified version and also exhibits a good tracking behaviour, we exemplify the given robustness by a realistic acoustic echo cancellation (AEC) scenario. The rest of this paper is structured as follows: Sec. 2 presents the motivation for this work in an AEC scenario. The proposed structure for adaptive filtering with automatic configuration of the filter length is given in Sec. 3, whereas the actual control mechanism is outlined in Sec. 4. A simplified but more efficient version of this approach is then discussed in Sec. 5. Finally, Sec. 6 presents selected experiments with noise and speech signals before the conclusions on these results are given in Sec. 7.

This work has been supported by the Deutsche Forschungsgemeinschaft (DFG) under contract number KE 890/5-1.

978-1-4244-3679-8/09/$25.00 ©2009 IEEE

2. PROBLEM FORMULATION

In order to motivate the proposed configuration method, we first review a typical system identification setup as depicted in Fig. 1 (top). The task of the adaptive filter $\hat{\mathbf{h}}(k)$ (see footnote 1) is to generate an appropriate estimate $\hat{y}(k)$ which is then subtracted from the desired response $d(k)$ so as to achieve optimum cancellation of the undistorted system output $y(k)$. Minimizing the residual error

$$e(k) = d(k) - \hat{y}(k) = y(k) - \hat{y}(k) + n(k) \qquad (1)$$

in the mean squared sense therefore yields the adaptation rule for the filter coefficients [1]. If this scenario is interpreted as an AEC problem, $d(k)$ denotes the microphone recording, which also includes $n(k)$, i.e. any background noise as well as local interferers. The acoustic echo $y(k)$ is then created by a generally time-variant unknown system that is mainly determined by the impulse response of the acoustic enclosure.

Figure 1: System identification scenario (top) and self-configuring adaptive structure (bottom) with filter length control (FLC)

Footnote 1: i.e. the vector of tap weights

Under the assumption that the unknown system can be modelled by a causal FIR system $\mathbf{g}$ of finite length $N_{\mathrm{opt}}$, the feedback component of the microphone signal is given by

$$y(k) = \sum_{n=0}^{N_{\mathrm{opt}}-1} g_n \cdot x(k-n) \qquad (2)$$


for each discrete-time sample $k$. Hence, the best performance in terms of system identification is also given by an adaptive transversal filter $\hat{\mathbf{h}}(k)$ with $N_{\mathrm{opt}}$ coefficients. However, without invoking a-priori knowledge, this optimum filter length is generally unknown for real systems. In light of this situation, three configurations of the filter length $N$ can be identified:

• $N > N_{\mathrm{opt}}$, where the additional degrees of freedom inhibit proper and fast convergence (overmodelling case),
• $N = N_{\mathrm{opt}}$, where the unknown system and the adaptive model thereof are matched in terms of memory size,
• $N < N_{\mathrm{opt}}$, where the filter cannot identify the whole system and thus fails to fully compensate $y(k)$ (undermodelling).

Clearly, a matching of the adaptive filter length to that of the unknown system is highly desirable in order to prevent the drawbacks of both under- and overmodelling.

3. SELF-CONFIGURING ADAPTIVE STRUCTURE

In order to achieve an automatic model match for real-world systems, we propose a new concept of adaptive filtering. As can be seen from Fig. 1 (bottom), the single adaptive filter is replaced by a combination of two different parallel filters A and B. This combination-of-filters paradigm has already been investigated by several authors [6, 7]. However, in contrast to its usual application, which yields enhanced overall performance by jointly exploiting the benefits of different adaptation settings, we employ this technique on filters that are driven with the same algorithmic parameters but operate with different lengths $N_A$ and $N_B$. Since both component filters $\hat{\mathbf{h}}_c(k)$, where $c \in \{A, B\}$, are operated independently [6], the corresponding outputs and errors read:

$$\hat{y}_c(k) = \sum_{n=0}^{N_c-1} \hat{h}_{c,n}(k) \cdot x(k-n) \qquad (3)$$

$$e_c(k) = d(k) - \hat{y}_c(k). \qquad (4)$$

Consequently, both components are adapted using an NLMS with step size parameter $\alpha$, i.e.

$$\hat{h}_{c,n}(k+1) = \hat{h}_{c,n}(k) + \frac{\alpha}{P_c(k)} \cdot e_c(k) \cdot x(k-n) \qquad (5)$$

where

$$P_c(k) := \sum_{n=0}^{N_c-1} x^2(k-n). \qquad (6)$$
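As a minimal sketch of (3) to (6), the following pure-Python code runs two independent NLMS component filters of different length on the same input. The toy 12-tap system `g`, the regularizer `eps`, the function names and all parameter values are assumptions made for illustration and are not taken from the experiments of this paper.

```python
import random


def nlms_step(h, x_buf, d, alpha, eps=1e-12):
    """One NLMS iteration of a component filter, following (3)-(6).

    h     : tap weights of the component (length N_c), updated in place
    x_buf : recent input samples with x_buf[n] = x(k - n), len(x_buf) >= len(h)
    d     : desired sample d(k)
    alpha : step size
    eps   : small regularizer against division by zero (assumption)
    """
    n_c = len(h)
    y = sum(h[n] * x_buf[n] for n in range(n_c))        # output, (3)
    e = d - y                                           # error, (4)
    p = sum(x_buf[n] ** 2 for n in range(n_c))          # regression energy, (6)
    gain = alpha / (p + eps)
    for n in range(n_c):                                # coefficient update, (5)
        h[n] += gain * e * x_buf[n]
    return y, e


def compare_lengths(n_a=8, n_b=12, num_samples=4000, alpha=0.5, seed=0):
    """Identify a toy 12-tap system with a short (A) and a matched (B) filter."""
    rng = random.Random(seed)
    g = [(-0.8) ** n for n in range(12)]    # hypothetical unknown system
    h_a, h_b = [0.0] * n_a, [0.0] * n_b
    x_buf = [0.0] * len(g)
    mse_a = mse_b = 0.0
    tail = 500                              # average over the last samples
    for k in range(num_samples):
        x_buf = [rng.gauss(0.0, 1.0)] + x_buf[:-1]      # white noise input
        d = sum(gn * xn for gn, xn in zip(g, x_buf))    # noise-free d(k)
        _, e_a = nlms_step(h_a, x_buf, d, alpha)
        _, e_b = nlms_step(h_b, x_buf, d, alpha)
        if k >= num_samples - tail:
            mse_a += e_a ** 2 / tail
            mse_b += e_b ** 2 / tail
    return mse_a, mse_b
```

Calling `compare_lengths()` returns the late-stage mean squared errors of both components. In this noise-free toy setup the undermodelling filter A keeps a residual error caused by the taps it cannot represent, whereas the matched filter B does not.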

In order to exploit the adaptation in both filters, the signals of the overall structure are created according to the combinations

$$\hat{y}(k) = \eta(k) \cdot \hat{y}_A(k) + \big(1 - \eta(k)\big) \cdot \hat{y}_B(k), \qquad (7)$$

$$e(k) = \eta(k) \cdot e_A(k) + \big(1 - \eta(k)\big) \cdot e_B(k), \qquad (8)$$

which yield the total output and overall error signal, respectively. Here, the convex mixing factor

$$\eta(k) := \frac{1}{1 + e^{-a(k)}} \qquad (9)$$

is a time-variant scalar which follows a sigmoid curve and hence $0 < \eta(k) < 1$. The actual adjustment of this mixing is thereby controlled by the parameter $a(k)$ which is updated in order to minimize the current overall error [6], i.e.

$$a(k+1) = a(k) - \frac{\mu_a}{2} \cdot \frac{\partial e^2(k)}{\partial a(k)} \qquad (10)$$

$$\phantom{a(k+1)} = a(k) + \mu_a \cdot e(k) \cdot \Delta e(k) \cdot \eta(k) \cdot \big(1 - \eta(k)\big) \qquad (11)$$

denotes the corresponding LMS-type adaptation. Interpreting the error difference $\Delta e(k) := e_B(k) - e_A(k)$ as the input signal to the one-tap filter $a(k)$, a normalized convex combination (NCC) [7] can be performed as well. Using this, (11) is transformed to

$$a(k+1) = a(k) + \frac{\mu_a}{P_\Delta(k)} \cdot \eta(k) \cdot \big(1 - \eta(k)\big) \cdot e(k) \cdot \Delta e(k) \qquad (12)$$

where the normalization term $P_\Delta(k)$ is typically smoothed by

$$P_\Delta(k) = 0.9 \cdot P_\Delta(k-1) + 0.1 \cdot \Delta e^2(k). \qquad (13)$$
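The normalized mixing update (8), (9), (12) and (13) can be sketched as follows. This is an illustrative sketch only: the function names, the regularizer `eps` and the order of operations (updating $P_\Delta(k)$ before using it) are assumptions. The clipping bound `A_MAX = 4` follows the limitation $a(k) \in [-4, +4]$ that the paper states for the combination updates.

```python
import math

A_MAX = 4.0   # limitation a(k) in [-4, +4] as stated in the paper


def sigmoid(a):
    """Convex mixing factor eta(k) as a function of a(k), see (9)."""
    return 1.0 / (1.0 + math.exp(-a))


def ncc_step(a, p_delta, e_a, e_b, mu_a=0.5, eps=1e-12):
    """One normalized update of the mixing parameter a(k).

    a       : current mixing parameter a(k)
    p_delta : previous normalization term P_Delta(k - 1)
    e_a/e_b : current errors of the component filters A and B
    mu_a    : mixing step size, 0 < mu_a < 2
    eps     : small regularizer against division by zero (assumption)
    Returns (a(k + 1), P_Delta(k), e(k)).
    """
    eta = sigmoid(a)                                   # (9)
    e = eta * e_a + (1.0 - eta) * e_b                  # overall error, (8)
    delta_e = e_b - e_a                                # error difference
    p_delta = 0.9 * p_delta + 0.1 * delta_e ** 2       # smoothing, (13)
    a = a + mu_a / (p_delta + eps) * eta * (1.0 - eta) * e * delta_e   # (12)
    a = max(-A_MAX, min(A_MAX, a))                     # keep a(k) bounded
    return a, p_delta, e
```

If component A consistently produces the smaller error, the product $e(k) \cdot \Delta e(k)$ is positive on average, so $a(k)$ grows and $\eta(k)$ moves towards its upper bound, which is the soft decision in favor of A.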

The NCC scheme implies both a simple choice of the mixing step size as $0 < \mu_a < 2$ and a more robust behaviour in unknown or time-varying noise conditions [7]. Moreover, (11) and (12) require a limitation of $a(k)$, which is typically given by $a(k) \in [-4, +4]$ in order to prevent a stalling of the combination updates [6]. Regarding (7) and (8), it is obvious that the performance of the total adaptive structure is governed by the value of $\eta(k)$, which defines the amount of contribution of both components of this scheme. Since the adjustment of the mixing parameter is based on the most recent performance of both filters, this value of $\eta(k)$ can also be understood as a soft decision in favor of the better filter. Therefore, this indicator can be used to scale the length of the adaptive filters until some optimum structure is reached.

4. FILTER LENGTH CONTROL

An efficient mechanism for the automatic configuration of the adaptive filter lengths can be derived by monitoring the evolution of the mixing parameter $\eta(k)$ of the structure as depicted in Fig. 1 (bottom). Since a good tracking capability of the combination is generally desirable, the step size for the update of the sigmoid curve is set to $\mu_a = 0.5$. However, in order to obtain a reliable decision for the configuration of the filter memory, it is necessary to emphasize only the general trend in the combination. Therefore, the following control algorithm is based on

$$\bar{\eta}(k) = \lambda \cdot \bar{\eta}(k-1) + (1 - \lambda) \cdot \eta(k) \qquad (14)$$

which denotes a version smoothed by the factor $\lambda$. In numerous experiments, very high values of this forgetting factor have been found to be quite robust and hence $\lambda = 0.9999$ has been used throughout this work. Moreover, a filter length control (FLC) requires that the memory sizes in (3) to (6) are replaced by their time-variant versions $N_A(k)$ and $N_B(k)$. Using these prerequisites, the actual FLC is implemented by a set of simple rules as detailed below.
In the beginning, both filters are initialized with some reasonable lengths such that a distance

$$N_{\mathrm{dist}} := N_B(0) - N_A(0) \qquad (15)$$

in size is achieved. Note that $N_{\mathrm{dist}}$ is kept constant throughout the whole processing and has to be selected large enough so as to ensure a significantly different performance of the two components. Depending on the value of $\bar{\eta}(k)$, the lengths of both filters $c \in \{A, B\}$ are then modified according to:

$$N_c(k+1) = \begin{cases} N_c(k) + \Delta N, & \text{if } \bar{\eta}(k) \le \eta_{\min} + \varepsilon_{\mathrm{inc}} \\ N_c(k) - \Delta N, & \text{if } \bar{\eta}(k) \ge \eta_{\max} - \varepsilon_{\mathrm{dec}} \\ N_c(k), & \text{else.} \end{cases} \qquad (16)$$

Here, $\Delta N$ denotes the number of coefficients by which the filters can be enlarged/reduced in each step and therefore also defines the possible resolution of the length estimate $\hat{N}_{\mathrm{opt}}(k)$. The threshold parameters $\varepsilon_{\mathrm{inc,dec}}$ can be specified in order to define the regions of $\bar{\eta}(k)$ where increments/decrements take place. The minimum and maximum values $\eta_{\min}$ and $\eta_{\max}$, however, are a consequence of the limitation of $a(k)$ as described in Sec. 3. For each increase event, the new coefficients of B are initialized with zeros whereas the additional coefficients in A are taken directly from B in order to benefit from its superiority as indicated by $\bar{\eta}(k)$. On the other hand, in case of decreasing filter lengths, both filters are truncated by discarding the $\Delta N$ last coefficients at the end. In addition, the regression energies (6) are adjusted in order to remove transient effects. Finally, the next application of (16) is prevented for a waiting phase of

$$K(k) = \tau \cdot \frac{N_A(k) + N_B(k)}{2} \qquad (17)$$

samples which accounts for the occurring reconvergence of the component filters after each change in memory size. In various experiments it has been found that choosing factors $\tau \in [1, 10]$ is sufficient in order to ascertain both a correct interpretation of $\bar{\eta}(k)$ and a fast tracking behaviour.

Figure 2: Example illustration for the operation of the FLC algorithm in different phases of the decision parameter $\bar{\eta}(k)$

In order to illustrate the self-configuration mechanism in more detail, this concept is exemplified by Fig. 2 which covers three phases of operation. First, ① represents a phase of initial length increase as the size of both filters is below $N_{\mathrm{opt}}$ (dashed line). However, as $N_B(k)$ is closer to the true number of coefficients, $\bar{\eta}(k)$ indicates a better performance of component B and thus the length of the filters is increased. This situation basically persists until the beginning of phase ② where the processing with B is no longer beneficial, as filter A is equally capable of representing all coefficients of the unknown system. Hence, $\bar{\eta}(k)$ remains at a center position between the regions defined by $\varepsilon_{\mathrm{inc}}$ and $\varepsilon_{\mathrm{dec}}$. Although both filters exhibit an overmodelling in phase ③ where the memory requirements have decreased, the FLC can nevertheless track the change in memory size. This is explained by the fact that in this situation, A provides a more suitable representation as it contributes less coefficient noise [1]. Accordingly, the decision will switch to values close to $\eta_{\max}$ and the length can be gradually decreased until another equilibrium ② is obtained. Thus, an estimate of the optimum filter size is given by $\hat{N}_{\mathrm{opt}}(k) = N_A(k)$.

5. SIMPLIFIED VERSION WITH VIRTUAL FILTER

Despite its effectiveness in estimating the optimum filter length, the structure from Fig. 1 is rather inefficient as it requires roughly twice the complexity of a single adaptive filter with length $N_{\mathrm{opt}}$. Therefore, Fig. 3 depicts a simplified version where

$$\hat{y}_B(k) = \hat{y}_A(k) + \sum_{n=N_A}^{N_A+N_{\mathrm{dist}}-1} \underbrace{\hat{h}_{\mathrm{dist},n-N_A}(k)}_{=\,\hat{h}_{B,n}(k)} \cdot x(k-n) \qquad (18)$$

employs $\hat{y}_A(k)$ as given by (3) and denotes the output of a virtual filter $\hat{\mathbf{h}}_B(k)$ having $N_B = N_A + N_{\mathrm{dist}}$ taps:

$$\hat{h}_{B,n}(k) := \begin{cases} \hat{h}_{A,n}(k), & \text{for } 0 \le n \le N_A - 1 \\ \hat{h}_{\mathrm{dist},n-N_A}(k), & \text{for } N_A \le n \le N_B - 1. \end{cases} \qquad (19)$$

Due to this simplification, the NLMS adaptation rules of the adaptive filter coefficients have to be implemented differently. All coefficients of component A are updated by

$$\hat{h}_{A,n}(k+1) = \hat{h}_{A,n}(k) + \frac{\alpha}{P_A(k)} \cdot e(k) \cdot x(k-n) \qquad (20)$$

which is based on the total residual error (8) as it is part of both the real and the virtual filter. In contrast, the additional coefficients $0 \le n \le N_{\mathrm{dist}} - 1$ of the virtual filter are adjusted by

$$\hat{h}_{\mathrm{dist},n}(k+1) = \hat{h}_{\mathrm{dist},n}(k) + \frac{\alpha \cdot \big(1 - \eta(k)\big)}{P_B(k)} \cdot e_B(k) \cdot x(k-n) \qquad (21)$$

which accounts for the fact that these are only incorporated in B. Note that the scaling by $1 - \eta(k)$ in (21) prevents the $N_{\mathrm{dist}}$ coefficients of $\hat{\mathbf{h}}_B(k)$ from diverging severely whenever the decision is made in favor of A. Therefore, this virtualized version requires only the complete filtering and updating of component A whereas the $N_{\mathrm{dist}}$ additional coefficients and the estimation of $\hat{N}_{\mathrm{opt}}(k)$ (via the FLC) can be computed with considerably reduced effort.
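The control rules (14) to (17) of Sec. 4 and the virtual-filter updates (18) to (21) of Sec. 5 can be sketched together as follows. This is a sketch under stated assumptions rather than the authors' implementation: the class and function names, the neutral start value of the smoothed decision, the guard against shrinking A below `delta_n` taps, the regularizer `eps` and the way a length change is mapped onto the virtual structure in `resize_virtual` are all illustrative choices. The extra taps are paired with the input samples $x(k - N_A - m)$ so that the update stays aligned with the filtering in (18), and the regression energies are simply recomputed per sample instead of being adjusted recursively.

```python
import math


class FilterLengthControl:
    """Threshold rules (14)-(17) acting on the mixing factor eta(k)."""

    def __init__(self, n_a, n_dist, delta_n, tau,
                 eps_inc=0.1, eps_dec=0.1, lam=0.9999, a_max=4.0):
        self.n_a, self.n_dist = n_a, n_dist
        self.delta_n, self.tau, self.lam = delta_n, tau, lam
        eta_max = 1.0 / (1.0 + math.exp(-a_max))   # eta for a(k) = +a_max
        self.inc_thr = (1.0 - eta_max) + eps_inc   # eta_min + eps_inc
        self.dec_thr = eta_max - eps_dec           # eta_max - eps_dec
        self.eta_bar = 0.5   # assumption: neutral start of the smoothed decision
        self.wait = 0        # remaining samples of the waiting phase K(k)

    def update(self, eta):
        """Process one value eta(k); return the length change (+dN, -dN or 0)."""
        self.eta_bar = self.lam * self.eta_bar + (1.0 - self.lam) * eta  # (14)
        if self.wait > 0:                  # still waiting for reconvergence
            self.wait -= 1
            return 0
        if self.eta_bar <= self.inc_thr:   # B is better: enlarge both filters
            step = self.delta_n
        elif self.eta_bar >= self.dec_thr and self.n_a > self.delta_n:
            step = -self.delta_n           # A is better: shrink (guard is assumed)
        else:
            return 0
        self.n_a += step                                   # (16)
        n_b = self.n_a + self.n_dist                       # N_dist constant, (15)
        self.wait = int(self.tau * (self.n_a + n_b) / 2)   # (17)
        return step


def resize_virtual(h_a, h_dist, step):
    """Apply a length change to the virtual structure h_B = [h_a | h_dist]."""
    h_b = h_a + h_dist
    if step > 0:
        h_b = h_b + [0.0] * step          # new taps of B start at zero
    elif step < 0:
        h_b = h_b[:len(h_b) + step]       # discard the last taps
    n_a = len(h_a) + step                 # A takes its taps from the front of B
    return h_b[:n_a], h_b[n_a:]


def virtual_filter_step(h_a, h_dist, x_buf, d, eta, alpha, eps=1e-12):
    """One sample of the simplified structure; x_buf[n] = x(k - n)."""
    n_a, n_dist = len(h_a), len(h_dist)
    y_a = sum(h_a[n] * x_buf[n] for n in range(n_a))                      # (3)
    y_b = y_a + sum(h_dist[m] * x_buf[n_a + m] for m in range(n_dist))    # (18)
    e_a, e_b = d - y_a, d - y_b
    e = eta * e_a + (1.0 - eta) * e_b                                     # (8)
    p_a = sum(x * x for x in x_buf[:n_a])                                 # (6)
    p_b = p_a + sum(x * x for x in x_buf[n_a:n_a + n_dist])
    gain_a = alpha / (p_a + eps)
    for n in range(n_a):                                                  # (20)
        h_a[n] += gain_a * e * x_buf[n]
    gain_b = alpha * (1.0 - eta) / (p_b + eps)
    for m in range(n_dist):                                               # (21)
        h_dist[m] += gain_b * e_b * x_buf[n_a + m]
    return e_a, e_b, e
```

In a complete loop, $\eta(k)$ would come from the normalized combination of Sec. 3, `update` would be called once per sample, and `resize_virtual` would be applied whenever a nonzero step is returned. The current value of `n_a` then serves as the length estimate $\hat{N}_{\mathrm{opt}}(k)$.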

Figure 3: Scheme of simplified FLC with a virtual filter B

6. RESULTS

The effectiveness of the proposed methods for the automatic length optimization of adaptive filters is now demonstrated by some experimental results for a realistic AEC scenario at a sampling rate of 8 kHz. In order to focus on the FLC performance only, all simulations are carried out for noisy, single-talk environments and use NLMS algorithms with $\alpha = 0.1$ unless stated otherwise.

In Fig. 4, the temporal evolution of $\hat{N}_{\mathrm{opt}}(k)$ is shown for a white Gaussian noise excitation with an SNR of 30 dB for an FLC with both a real (top) and the virtual (bottom) filter implementation. The unknown system is modelled by a room impulse response (RIR) that has been measured in an audio lab with $T_{60} = 300$ ms and is truncated to $N_{\mathrm{opt}} = 500$. The FLC is set up with thresholds $\varepsilon_{\mathrm{inc,dec}} = 0.1$, increments $\Delta N = 10$ and size distance $N_{\mathrm{dist}} = 40$ due to $N_A(0) = 10$, $N_B(0) = 50$. As can be seen in the left-hand plots, the estimation performance of the FLC hardly depends on the waiting factor $\tau$ and is effective for both implementations, although the virtual version requires significantly fewer computations throughout the whole operation. To illustrate the ERLE (see footnote 2) performance throughout the length changes, the plots to the right compare an FLC with $\tau = 3$ to an adaptive filter that is a-priori matched in size with length $N_{\mathrm{opt}}$. It can be seen that the convergence of the FLC is quite fast despite its initialization with a considerably shorter filter and that the reduced number of coefficients may even lead to accelerated convergence in the beginning of the adaptation.

Figure 4: Estimation of $\hat{N}_{\mathrm{opt}}(k)$ and ERLE for self-configuring filters in real (top) and virtual (bottom) implementation
[Left plots: $\hat{N}_{\mathrm{opt}}(k)$ over time [s] for $\tau$ = 1, 2, 3, 5, 10. Right plots: ERLE(k) [dB] over time [s] for the FLC ($\tau = 3$).]

Footnote 2: ERLE = echo return loss enhancement

Similar results can be obtained for speech input as shown in Fig. 5 for an unknown system with $N_{\mathrm{opt}} = 600$ coefficients. Due to the coloured and nonstationary nature of these signals, the threshold parameters $\varepsilon_{\mathrm{inc,dec}}$ have to be chosen more aggressively in response to the less distinctive differences between the component filters. The waiting factor has been selected as $\tau = 3$ and increments of $\Delta N = 20$ taps are used in connection with $N_{\mathrm{dist}} = 100$ coefficients. Since the SNR has been adjusted to 25 dB here, these results demonstrate the effectiveness of the FLC algorithms for realistic AEC tasks. Note that the robustness for nonstationary signals could be improved further by smoothing the decision variable more heavily (e.g. $\lambda = 0.99995$).

Figure 5: Performance of the FLC (virtual) for speech input
[Plots: $\hat{N}_{\mathrm{opt}}(k)$ over time [s] for $\varepsilon_{\mathrm{inc,dec}}$ = 0.10, 0.15, 0.20, 0.25, and the input signal x(k).]

Finally, Fig. 6 illustrates the tracking ability of the FLC. Here, the unknown system has been modelled by a very long RIR of size $N_{\mathrm{opt}} = 2500$ that has been truncated to 350 and extended to 1100 coefficients after 50 and 100 seconds, respectively. The adaptation is done at an SNR of 20 dB using a step size $\alpha = 0.2$, and the FLC parameters are given by $\Delta N = 50$, $\tau = 3$ and a size distance of $N_{\mathrm{dist}} = 200$ taps where the estimation is initialized by $N_A(0) = 1000$. As can be seen from the plot, the optimum lengths (dotted line) can be followed quite well for both white and speech-like coloured noise, although the very small coefficient values in the highest taps of the acoustic system cannot be fully detected.

Figure 6: Tracking ability of the FLC (virtual) and corresponding evolution of $\bar{\eta}(k)$ for white noise input
[Plots: $\hat{N}_{\mathrm{opt}}(k)$ and $N_{\mathrm{opt}}$ over time [s] for white ($\varepsilon_{\mathrm{inc,dec}}$ = 0.10), coloured ($\varepsilon_{\mathrm{inc,dec}}$ = 0.10) and coloured ($\varepsilon_{\mathrm{inc,dec}}$ = 0.25) noise, and $\bar{\eta}(k)$ over time [s].]

7. CONCLUSIONS

A novel strategy of adaptive filtering for the identification of real-world systems with unknown and time-varying memory size has been proposed. By monitoring the inherent soft decision of a normalized filter combination scheme, we have developed a control mechanism for the dynamic adjustment of the adaptive filter length. Since this self-configuration mechanism is based on a set of simple thresholding rules, the algorithm provides a robust estimation and tracking performance. Furthermore, a computationally more efficient realization with a virtual component filter has been outlined and was shown to be equally effective in experiments with real impulse responses, noisy environments and both noise and speech signals.

8. REFERENCES

[1] S. Haykin, Adaptive Filter Theory. New Jersey: Prentice Hall, 2002 (4th Edition).
[2] Z. Pritzker and A. Feuer, “Variable length stochastic gradient algorithm,” IEEE Trans. on Signal Processing, vol. 39, no. 4, pp. 997–1001, April 1991.
[3] Y. Gong and C. F. N. Cowan, “An LMS style variable tap-length algorithm for structure adaptation,” IEEE Trans. on Signal Processing, vol. 53, no. 7, pp. 2400–2407, July 2005.
[4] Y. Zhang, J. A. Chambers, S. Sanei, P. Kendrick, and T. J. Cox, “A new variable tap-length LMS algorithm to model an exponential decay impulse response,” IEEE Signal Processing Letters, vol. 14, no. 4, pp. 263–266, April 2007.
[5] Y. Zhang and J. A. Chambers, “Convex combination of adaptive filters for a variable tap-length LMS algorithm,” IEEE Signal Processing Letters, vol. 13, no. 10, pp. 628–631, October 2006.
[6] J. Arenas-García, V. Gómez-Verdejo, and A. R. Figueiras-Vidal, “New algorithms for improved adaptive convex combination of LMS transversal filters,” IEEE Trans. on Instrumentation and Measurement, vol. 54, no. 6, pp. 2239–2249, Dec. 2005.
[7] L. A. Azpicueta-Ruiz, A. R. Figueiras-Vidal, and J. Arenas-García, “A normalized adaptation scheme for the convex combination of two adaptive filters,” in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, April 2008, pp. 3301–3304.