Signal Processing 93 (2013) 511–516
Fast communication
A widely linear model for stereophonic acoustic echo cancellation
Cristian Stanciu a, Jacob Benesty b, Constantin Paleologu a,*, Tomas Gänsler c, Silviu Ciochină a
a University Politehnica of Bucharest, 1-3, Iuliu Maniu Blvd., 061071 Bucharest, Romania
b INRS-EMT, University of Quebec, Montreal, QC, Canada H5A 1K6
c mh Acoustics, 25-A Summit Avenue, Summit, NJ 07901, USA
Article info
Article history: Received 8 March 2012; received in revised form 4 June 2012; accepted 17 August 2012; available online 29 August 2012.
Keywords: Stereophonic acoustic echo cancellation; Widely linear (WL) model; Nonlinear distortion; Recursive least-squares (RLS)-dichotomous coordinate descent (DCD) algorithm

Abstract
The stereophonic acoustic echo, due to the coupling between two loudspeakers and two microphones, can be modelled by a two-input/two-output system with real random variables. In this paper, we recast the problem as a single-input/single-output system with complex random variables, by using the widely linear (WL) model, and propose a new distortion method that fits well in this context. In order to illustrate the behavior of this scheme, the recursive least-squares (RLS)-dichotomous coordinate descent (DCD) algorithm is used. Experimental results indicate that the RLS-DCD algorithm represents an attractive choice for this application since it has good numerical features in terms of stability and complexity. © 2012 Elsevier B.V. All rights reserved.
* Corresponding author. E-mail addresses: [email protected] (C. Stanciu), [email protected] (J. Benesty), [email protected] (C. Paleologu), [email protected] (T. Gänsler), [email protected] (S. Ciochină).
0165-1684/$ - see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.sigpro.2012.08.017

1. Introduction

Research and development of stereophonic acoustic echo cancellation (SAEC) systems have been a subject of interest over the last two decades [1,2]. The stereo transmission, which can provide schemes for telepresence along with our binaural hearing system, is becoming very popular in hands-free teleconferencing systems. In the usual approach, an SAEC system consists of four adaptive filters aiming at identifying four echo paths from two loudspeakers to two microphones. For each microphone in the receiving (i.e., near-end) location, SAEC amounts to the identification of a two-input unknown system, consisting of the parallel combination of two acoustic echo paths (from the two loudspeakers to the microphone).

The main challenge of SAEC is that the two channels may carry linearly related signals, which in turn may make the normal equation, to be solved by the adaptive algorithm, singular. This implies that there is no unique solution to the equation (as in the single-channel case) but an infinite number of solutions [3]. This nonuniqueness problem can be solved by using a preprocessor on the loudspeaker signals in order to reduce their coherence and thereby remove the singularity [4]. Of course, this distortion should not noticeably degrade the stereo perception or the sound quality.

In this paper, we present a different approach for SAEC by using the widely linear (WL) model [5,6]; in this framework, the classical two-input/two-output scheme with real random variables is recast as a single-input/single-output system with complex random variables. Also, we develop a new nonlinear distortion method which is more suitable for the WL model of SAEC. Besides, we propose to use the recursive least-squares (RLS)-dichotomous coordinate descent (DCD) algorithm [7,8] as an alternative to the classical choice for SAEC, i.e., the fast RLS (FRLS) algorithm [1,2]. Simulation results indicate that the RLS-DCD algorithm represents an attractive choice for SAEC, in terms of convergence rate, stability, and complexity.
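The nonuniqueness problem mentioned above can be illustrated with a small numerical sketch. It assumes, as in the analysis of [3], that both loudspeaker signals are filtered versions of one far-end source; the filter values, the signal length, and the names `fir`, `g_L`, `g_R`, `h_L`, `h_R`, and `beta` are hypothetical and chosen only for illustration. Two clearly different pairs of echo-path estimates then produce the same echo at one microphone.

```python
import random

def fir(h, x):
    """Causal FIR filtering with zero initial state: y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

random.seed(0)
s = [random.gauss(0.0, 1.0) for _ in range(200)]   # far-end source (white noise here)
g_L, g_R = [1.0, 0.5, -0.2], [0.7, -0.3, 0.1]      # hypothetical far-end paths
x_L, x_R = fir(g_L, s), fir(g_R, s)                # linearly related loudspeaker signals

h_L, h_R = [0.9, 0.1, 0.05], [0.4, -0.2, 0.3]      # hypothetical near-end echo paths
echo = [a + b for a, b in zip(fir(h_L, x_L), fir(h_R, x_R))]

# A different filter pair: since g_R * x_L = g_L * x_R, adding beta*g_R to the
# left path and subtracting beta*g_L from the right path leaves the echo unchanged.
beta = 2.5
alt_L = [h + beta * g for h, g in zip(h_L, g_R)]
alt_R = [h - beta * g for h, g in zip(h_R, g_L)]
alt_echo = [a + b for a, b in zip(fir(alt_L, x_L), fir(alt_R, x_R))]

max_diff = max(abs(a - b) for a, b in zip(echo, alt_echo))
filters_differ = max(abs(a - b) for a, b in zip(h_L, alt_L)) > 1.0
```

Here `max_diff` stays at rounding-error level although the two filter pairs differ substantially. An adaptive filter driven only by the echo cannot tell them apart, which is why a decorrelating preprocessor is used.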
2. The WL model for SAEC

Let us consider the stereophonic setup, where we have two input (loudspeaker) signals, denoted by $x_L(n)$ and $x_R(n)$ (i.e., "left" and "right"), and two output (microphone) signals, denoted by $d_L(n)$ and $d_R(n)$, where $n$ is the time index. In the receiving location, the microphone signals are obtained as

$$d_L(n) = y_L(n) + v_L(n), \tag{1}$$

$$d_R(n) = y_R(n) + v_R(n), \tag{2}$$

where $y_L(n)$ and $y_R(n)$ denote the stereo echo signals, and $v_L(n)$ and $v_R(n)$ are the near-end signals (i.e., noise or a combination of noise and near-end speech). The echo signals can be modelled as [3,4]

$$y_L(n) = \mathbf{h}_{t,LL}^T \mathbf{x}_L(n) + \mathbf{h}_{t,RL}^T \mathbf{x}_R(n), \tag{3}$$

$$y_R(n) = \mathbf{h}_{t,LR}^T \mathbf{x}_L(n) + \mathbf{h}_{t,RR}^T \mathbf{x}_R(n), \tag{4}$$

where $\mathbf{h}_{t,LL}$, $\mathbf{h}_{t,RL}$, $\mathbf{h}_{t,LR}$, $\mathbf{h}_{t,RR}$ are $L$-dimensional vectors of the loudspeaker-to-microphone (true) acoustic impulse responses, the superscript $T$ denotes transposition, and

$$\mathbf{x}_L(n) = [x_L(n) \;\; x_L(n-1) \;\; \cdots \;\; x_L(n-L+1)]^T,$$

$$\mathbf{x}_R(n) = [x_R(n) \;\; x_R(n-1) \;\; \cdots \;\; x_R(n-L+1)]^T$$

comprise the $L$ most recent loudspeaker signal samples. In order to cancel the echo, we need to estimate the four acoustic impulse responses, $\mathbf{h}_{t,LL}$, $\mathbf{h}_{t,RL}$, $\mathbf{h}_{t,LR}$, $\mathbf{h}_{t,RR}$, from the microphone signals $d_L(n)$ and $d_R(n)$.

We propose to recast the classical two-input/two-output scheme (with real random variables) as a single-input/single-output system with complex random variables (CRVs). First, we can form the CRV

$$d(n) = d_L(n) + j d_R(n) = y(n) + v(n), \tag{5}$$

where $j = \sqrt{-1}$, $y(n) = y_L(n) + j y_R(n)$, and $v(n) = v_L(n) + j v_R(n)$. Next, let us define the complex random vector

$$\mathbf{x}(n) = \mathbf{x}_L(n) + j \mathbf{x}_R(n). \tag{6}$$
In this context, the (complex) echo signal can be obtained as

$$y(n) = \mathbf{h}_t^{H} \mathbf{x}(n) + \mathbf{h}_t^{\prime H} \mathbf{x}^*(n), \tag{7}$$

where the superscripts $H$ and $*$ denote transpose-conjugate and conjugate, respectively, and

$$\mathbf{h}_t = \mathbf{h}_{t,1} + j \mathbf{h}_{t,2}, \tag{8}$$

$$\mathbf{h}'_t = \mathbf{h}'_{t,1} + j \mathbf{h}'_{t,2}, \tag{9}$$

with $\mathbf{h}_{t,1} = (\mathbf{h}_{t,LL} + \mathbf{h}_{t,RR})/2$, $\mathbf{h}_{t,2} = (\mathbf{h}_{t,RL} - \mathbf{h}_{t,LR})/2$, $\mathbf{h}'_{t,1} = (\mathbf{h}_{t,LL} - \mathbf{h}_{t,RR})/2$, and $\mathbf{h}'_{t,2} = -(\mathbf{h}_{t,RL} + \mathbf{h}_{t,LR})/2$. Substituting these definitions into (7) and separating the real and imaginary parts recovers (3) and (4). Alternatively, we can express (7) as

$$y(n) = \tilde{\mathbf{h}}_t^{H} \tilde{\mathbf{x}}(n), \tag{10}$$

where $\tilde{\mathbf{h}}_t = [\mathbf{h}_t^T \;\; \mathbf{h}_t^{\prime T}]^T$ and $\tilde{\mathbf{x}}(n) = [\mathbf{x}^T(n) \;\; \mathbf{x}^H(n)]^T$. Therefore, the complex observation is

$$d(n) = \tilde{\mathbf{h}}_t^{H} \tilde{\mathbf{x}}(n) + v(n). \tag{11}$$

It can be noticed that we are dealing now with a complex acoustic impulse response of length $2L$, i.e., $\tilde{\mathbf{h}}_t$, whose complex input and output are, respectively, $x(n)$ and $d(n)$. From (7) or (10), we recognize the widely linear (WL) model for CRVs proposed in [5]. Thanks to the WL model, the two-input/two-output system with real random variables was converted to a single-input/single-output system with CRVs. This approach is in line with the duality principle [9].

Finally, the new goal is to estimate the system $\tilde{\mathbf{h}}_t$ in order to cancel the echo. Let $\tilde{\mathbf{h}}(n)$ be an adaptive filter (which is an estimate of $\tilde{\mathbf{h}}_t$) and let

$$e(n) = d(n) - \tilde{\mathbf{h}}^{H}(n-1) \tilde{\mathbf{x}}(n) \tag{12}$$

be the error signal at time $n$. In this context, Fig. 1 depicts the proposed WL model for SAEC. As compared to the classical SAEC approach [3,4], which requires four adaptive filters of length $L$, the proposed model involves only one filter of length $2L$. On the other hand, we are dealing now with CRVs having both real and imaginary parts. Apparently, the overall complexity of the proposed model is similar to that of the classical approach. However, there are many other aspects that should be taken into account in practice, e.g., numerical effects (since SAEC generally relies on RLS-based algorithms, which encounter specific numerical problems in finite precision), implementation issues, memory usage, etc.; consequently, it could be more convenient to handle only one adaptive filter instead of four such systems. Moreover, as we will show in the next section, there are other specific features of the CRVs which could be exploited in order to improve the overall performance of the SAEC scheme.

To end this section, let us redefine in the context of the WL model one of the most used performance measures in echo cancellation, which is the so-called normalized misalignment [1]. It quantifies directly how "well" (in terms of convergence, tracking, and accuracy of the solution) an adaptive filter converges to the impulse response of the system that needs to be identified. The normalized misalignment (in dB) in the WL context is defined as

$$\mathrm{Mis}(n) = 20 \log_{10} \frac{\| \tilde{\mathbf{h}}_t - \tilde{\mathbf{h}}(n) \|_2}{\| \tilde{\mathbf{h}}_t \|_2} \;\; \text{(dB)}, \tag{13}$$

where $\| \cdot \|_2$ denotes the $\ell_2$ norm.

3. A new distortion for the WL model
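Before turning to the distortion, the WL recasting of Section 2 can be checked numerically. The sketch below draws four random real impulse responses (hypothetical values, with a short length `L = 4` chosen only for illustration), builds the complex filters of (8) and (9), and compares the WL output (7) with $y_L(n) + j y_R(n)$ from (3) and (4). The helper names `rand_vec` and `dot` are not from the paper.

```python
import random

def rand_vec(n):
    """Random real vector standing in for an impulse response or a signal frame."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def dot(a, b):
    """Plain inner product a^T b (no conjugation)."""
    return sum(p * q for p, q in zip(a, b))

random.seed(1)
L = 4
h_LL, h_RL, h_LR, h_RR = (rand_vec(L) for _ in range(4))
x_L, x_R = rand_vec(L), rand_vec(L)

# Real two-input/two-output model, Eqs. (3)-(4)
y_L = dot(h_LL, x_L) + dot(h_RL, x_R)
y_R = dot(h_LR, x_L) + dot(h_RR, x_R)

# Complex filters, Eqs. (8)-(9); note the minus sign in the imaginary part of h_t'
h_t = [complex((ll + rr) / 2, (rl - lr) / 2)
       for ll, rl, lr, rr in zip(h_LL, h_RL, h_LR, h_RR)]
h_tp = [complex((ll - rr) / 2, -(rl + lr) / 2)
        for ll, rl, lr, rr in zip(h_LL, h_RL, h_LR, h_RR)]

# Complex input vector, Eq. (6)
x = [complex(l, r) for l, r in zip(x_L, x_R)]

# WL model, Eq. (7): y = h_t^H x + h_t'^H x^*
y = (sum(h.conjugate() * v for h, v in zip(h_t, x))
     + sum(h.conjugate() * v.conjugate() for h, v in zip(h_tp, x)))

err = abs(y - complex(y_L, y_R))
```

The difference `err` is at rounding-error level, which confirms that one complex WL filter of length $2L$ carries the same information as the four real filters of length $L$.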
As discussed in Section 1, it may be required to distort the input signals $x_L(n)$ and $x_R(n)$ in order to have a unique solution to the SAEC problem. Reducing the coherence between these two signals will lead to a better estimate of the true acoustic impulse responses [3,4,10]. Obviously, this distortion should be performed without noticeably affecting the quality of the signals and the stereo effect. A simple but efficient nonlinear method uses a positive half-wave rectifier on one channel and a negative half-wave rectifier on the other [4].
Fig. 1. The WL model for SAEC.
The nonlinearly transformed signals become

$$x'_L(n) = x_L(n) + \alpha_r \frac{x_L(n) + |x_L(n)|}{2}, \tag{14}$$

$$x'_R(n) = x_R(n) + \alpha_r \frac{x_R(n) - |x_R(n)|}{2}, \tag{15}$$

where $\alpha_r$ is a parameter used to control the amount of nonlinearity. Experiments show that stereo perception is not affected by this method even with $\alpha_r$ as large as 0.5. Also, the audible distortion introduced for speech is small because of the nature of the speech signal and psychoacoustic masking effects [11].

In the following, we propose a new distortion that fits well with the WL model. The complex input signal can be expressed as

$$x(n) = x_L(n) + j x_R(n) = e^{j\theta_r(n)} |x(n)|, \tag{16}$$

where $\theta_r(n)$ [with $\tan\theta_r(n) = x_R(n)/x_L(n)$] and $|x(n)| = \sqrt{x_L^2(n) + x_R^2(n)}$ are the phase and modulus of $x(n)$, respectively. In this formulation, we represent the stereo perception with $\theta_r(n)$ and the quality of the stereo signals with $|x(n)|$. A modification of $\theta_r(n)$ only will mostly affect the stereo effect of $x(n)$, while a modification of $|x(n)|$ will mostly affect the quality of the stereo signals. With the complex notation, (14) and (15) can be expressed as

$$x'(n) = x'_L(n) + j x'_R(n) = e^{j\theta'_r(n)} |x'(n)|, \tag{17}$$

where

$$\tan\theta'_r(n) = \frac{x'_R(n)}{x'_L(n)} = \tan\theta_r(n) \, \frac{\alpha_r + 2 - \alpha_r \,\mathrm{sgn}[x_R(n)]}{\alpha_r + 2 + \alpha_r \,\mathrm{sgn}[x_L(n)]} \tag{18}$$

and

$$|x'(n)| = \sqrt{(1 + \alpha_r + 0.5\alpha_r^2)\,|x(n)|^2 + \alpha_r (1 + 0.5\alpha_r)\left[ x_L(n)|x_L(n)| - x_R(n)|x_R(n)| \right]}.$$
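The phase and modulus expressions that follow from (14) and (15) can be checked sample by sample. The sketch below is only an illustration: the test values are random, `alpha = 0.3` is the value used later in the simulations, and the helper names `sgn` and `half_wave` are not from the paper. It applies the half-wave rectifiers directly and compares the result with the closed forms for $\tan\theta'_r(n)$ and $|x'(n)|$.

```python
import math
import random

def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)

def half_wave(x_L, x_R, alpha):
    """Eqs. (14)-(15): positive rectifier on the left, negative on the right."""
    return (x_L + alpha * (x_L + abs(x_L)) / 2,
            x_R + alpha * (x_R - abs(x_R)) / 2)

random.seed(2)
alpha = 0.3
worst_phase, worst_mod = 0.0, 0.0
for _ in range(1000):
    x_L, x_R = random.uniform(-1, 1), random.uniform(-1, 1)
    xp_L, xp_R = half_wave(x_L, x_R, alpha)

    # Closed form for tan(theta'_r), Eq. (18)
    tan_pred = ((x_R / x_L) * (alpha + 2 - alpha * sgn(x_R))
                / (alpha + 2 + alpha * sgn(x_L)))
    # Closed form for |x'(n)|^2 (clamped at zero to guard against rounding)
    mod2_pred = max(0.0, (1 + alpha + 0.5 * alpha ** 2) * (x_L ** 2 + x_R ** 2)
                    + alpha * (1 + 0.5 * alpha) * (x_L * abs(x_L) - x_R * abs(x_R)))

    # Relative error on the tangent, absolute error on the modulus
    worst_phase = max(worst_phase,
                      abs(xp_R / xp_L - tan_pred) / (1 + abs(tan_pred)))
    worst_mod = max(worst_mod,
                    abs(math.hypot(xp_L, xp_R) - math.sqrt(mod2_pred)))
```

Both `worst_phase` and `worst_mod` remain at rounding-error level, so the two closed forms agree with the direct application of the rectifiers.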
From (18) and the expression of $|x'(n)|$, we observe that both the phase and the modulus are modified by the nonlinear distortion. Remarkably, even with a value of $\alpha_r$ as large as 0.5, the stereo effect is not affected. This is likely due to the fact that the phase is not changed randomly, as in some other approaches, but according to the changes of the stereo signals.

The specific SAEC problem of nonuniqueness arises because the signals $x_L(n)$ and $x_R(n)$ are linearly related. Let us consider the worst-case scenario, where $x_L(n)$ is equal to $x_R(n)$, i.e.,

$$x_L(n) = x_R(n), \quad \forall n. \tag{19}$$

In this situation, (17) becomes

$$x'(n) = \sqrt{1 + \alpha_r + 0.5\alpha_r^2} \; e^{j\theta'_r(n)} |x(n)|, \tag{20}$$

where

$$\tan\theta'_r(n) = \frac{1}{\alpha_r + 1} \tan\theta_r(n) \quad \text{if } x_L(n) > 0, \tag{21}$$

$$\tan\theta'_r(n) = (\alpha_r + 1) \tan\theta_r(n) \quad \text{if } x_L(n) < 0. \tag{22}$$
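As a quick consistency check (a sketch that assumes $x_L(n) = x_R(n) \neq 0$ and $\alpha_r > -1$), the two cases follow directly from (14) and (15):

$$x_L(n) > 0: \quad x'_L(n) = (1+\alpha_r)\,x_L(n), \;\; x'_R(n) = x_R(n) \;\;\Rightarrow\;\; \tan\theta'_r(n) = \frac{x_R(n)}{(1+\alpha_r)\,x_L(n)} = \frac{\tan\theta_r(n)}{\alpha_r + 1},$$

$$x_L(n) < 0: \quad x'_L(n) = x_L(n), \;\; x'_R(n) = (1+\alpha_r)\,x_R(n) \;\;\Rightarrow\;\; \tan\theta'_r(n) = \frac{(1+\alpha_r)\,x_R(n)}{x_L(n)} = (\alpha_r + 1)\tan\theta_r(n).$$

In both cases $|x'(n)|^2 = [1 + (1+\alpha_r)^2]\,x_L^2(n) = (1 + \alpha_r + 0.5\alpha_r^2)\,|x(n)|^2$, because $|x(n)|^2 = 2x_L^2(n)$ here. This agrees with the constant factor in (20).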
We see that the modulus is only scaled by a constant factor, since $\alpha_r$ is constant across time, whereas $\theta'_r(n)$ depends on the sign of $x_L(n) = x_R(n)$. As a result, only the phase is changed in a signal-dependent manner. While $x_L(n) = x_R(n)$, we have $x'_L(n) \neq x'_R(n)$, and the transformed signals are no longer linearly related. We know by experience that, even in this difficult scenario, the misalignment is improved with the nonlinear transformations. This suggests that we may not really need to modify the modulus of the complex signal $x(n)$. Therefore, we propose the following new transformations:

$$x''_L(n) = \cos\theta'_r(n) \, |x(n)|, \tag{23}$$

$$x''_R(n) = \sin\theta'_r(n) \, |x(n)|. \tag{24}$$
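A minimal sketch of (23) and (24) is given below. The paper specifies $\theta'_r(n)$ only through its tangent; resolving the quadrant with `math.atan2` applied to the rectified pair is an assumption of this sketch, as are the function names and the sample values. The example uses the worst case $x_L(n) = x_R(n)$.

```python
import math

def half_wave(x_L, x_R, alpha):
    """Eqs. (14)-(15): positive rectifier on the left, negative on the right."""
    return (x_L + alpha * (x_L + abs(x_L)) / 2,
            x_R + alpha * (x_R - abs(x_R)) / 2)

def new_distortion(x_L, x_R, alpha):
    """Eqs. (23)-(24): phase of the rectified pair, modulus of the original pair."""
    xp_L, xp_R = half_wave(x_L, x_R, alpha)
    theta_p = math.atan2(xp_R, xp_L)   # four-quadrant phase (assumption of this sketch)
    mod = math.hypot(x_L, x_R)         # modulus of the undistorted complex sample
    return mod * math.cos(theta_p), mod * math.sin(theta_p)

alpha = 0.3
samples = [0.8, -0.5, 0.2, -0.9]                       # worst case: x_L(n) = x_R(n)
out = [new_distortion(v, v, alpha) for v in samples]
ratios = [r / l for l, r in out]                       # tan(theta'_r) per sample
mod_errors = [abs(math.hypot(l, r) - math.hypot(v, v))
              for (l, r), v in zip(out, samples)]
```

For positive samples the ratio $x''_R(n)/x''_L(n)$ equals $1/(1+\alpha_r)$ and for negative samples it equals $1+\alpha_r$, so the two output channels are no longer identical, while `mod_errors` shows that the modulus of each complex sample is unchanged.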
Clearly, the phase is computed from the half-wave rectifiers [see (18)], while the modulus is that of the original signals. As a consequence, with (23) and (24) we may obtain the same misalignment as with (14) and (15), but with the advantage of little distortion. We can therefore even increase the value of $\alpha_r$ to obtain better performance, as long as the stereo effect is not much affected.

4. The RLS-DCD algorithm

Thanks to their fast convergence rate, RLS-based algorithms are very popular in SAEC. In the following, we slightly change the notation for convenience. Let us redefine the input signal vector (of length $2L$) as

$$\tilde{\mathbf{x}}(n) = [\mathbf{v}^T(n) \;\; \mathbf{v}^T(n-1) \;\; \cdots \;\; \mathbf{v}^T(n-L+1)]^T, \tag{25}$$

where $\mathbf{v}(n) = [x(n) \;\; x^*(n)]^T$. Thus, the new definitions of the true impulse response and the adaptive filter are

$$\tilde{\mathbf{h}}_t = [h_{t,0} \;\; h'_{t,0} \;\; \cdots \;\; h_{t,L-1} \;\; h'_{t,L-1}]^T,$$

$$\tilde{\mathbf{h}}(n) = [h_0(n) \;\; h'_0(n) \;\; \cdots \;\; h_{L-1}(n) \;\; h'_{L-1}(n)]^T,$$

where $h_{t,l}$, $h'_{t,l}$, $h_l(n)$, and $h'_l(n)$, with $l = 0, 1, \ldots, L-1$, are the elements of the vectors $\mathbf{h}_t$, $\mathbf{h}'_t$, $\mathbf{h}(n)$, and $\mathbf{h}'(n)$, respectively (i.e., the vectors are now interleaved instead of being concatenated). Obviously, these new definitions do not change the definition of $d(n)$ [see (11)]. Next, we define the least-squares (LS) error criterion as [2]

$$J[\tilde{\mathbf{h}}(n)] = \sum_{i=1}^{n} \lambda^{n-i} \left| d(i) - \tilde{\mathbf{h}}^{H}(n) \tilde{\mathbf{x}}(i) \right|^2, \tag{26}$$

where $\lambda$ $(0 \ll \lambda < 1)$ is the forgetting factor, which influences the memory of the data in the different statistics estimates. The minimization of $J[\tilde{\mathbf{h}}(n)]$ with respect to $\tilde{\mathbf{h}}(n)$ leads to the normal equation [2]:

$$\mathbf{R}_{\tilde{x}}(n) \tilde{\mathbf{h}}(n) = \mathbf{p}_{\tilde{x}d}(n), \tag{27}$$

where $\mathbf{R}_{\tilde{x}}(n) = \sum_{i=1}^{n} \lambda^{n-i} \tilde{\mathbf{x}}(i) \tilde{\mathbf{x}}^{H}(i)$ and $\mathbf{p}_{\tilde{x}d}(n) = \sum_{i=1}^{n} \lambda^{n-i} \tilde{\mathbf{x}}(i) d^*(i)$.

The classical RLS algorithm [2] was developed in order to recursively solve (27); unfortunately, its arithmetic complexity is proportional to $(2L)^2$, which is prohibitively high for SAEC implementations. The FRLS algorithm [2] further reduces the computational amount; the arithmetic complexity of this algorithm is proportional to $2L$, which is realizable in practice. However, with nonstationary signals like speech, the FRLS algorithm is not stable and needs to be restarted when instability is detected (by using a specific variable), which is not always easy. In order to overcome this issue, we propose to use the RLS-DCD algorithm [7,8] in the context of the WL model for SAEC.

Following the steps presented in [7], the normal equation (27) can be recursively solved as shown in Table 1, where step 4 involves the DCD iterations [12]. In this table, $\mathbf{r}(n)$ is the so-called residual vector of the solution [7], $\Delta\tilde{\mathbf{h}}(n)$ is the increment of the filter weights (or the solution vector of the DCD algorithm), $\delta$ denotes the initialization constant, and $\mathbf{I}_{2L}$ is the $2L \times 2L$ identity matrix.

Table 1. RLS-DCD algorithm.
Initialization: $\tilde{\mathbf{h}}(0) = \mathbf{0}$, $\mathbf{r}(0) = \mathbf{0}$, $\mathbf{R}_{\tilde{x}}(0) = \delta \mathbf{I}_{2L}$
For $n = 1, 2, \ldots$
Step 1: $\mathbf{R}_{\tilde{x}}(n) = \lambda \mathbf{R}_{\tilde{x}}(n-1) + \tilde{\mathbf{x}}(n) \tilde{\mathbf{x}}^{H}(n)$
Step 2: $e(n) = d(n) - \tilde{\mathbf{h}}^{H}(n-1) \tilde{\mathbf{x}}(n)$
Step 3: $\mathbf{p}(n) = \lambda \mathbf{r}(n-1) + \tilde{\mathbf{x}}(n) e^*(n)$
Step 4: $\mathbf{R}_{\tilde{x}}(n) \Delta\tilde{\mathbf{h}}(n) = \mathbf{p}(n) \Rightarrow \Delta\tilde{\mathbf{h}}(n), \mathbf{r}(n)$ (to be solved with DCD iterations)
Step 5: $\tilde{\mathbf{h}}(n) = \tilde{\mathbf{h}}(n-1) + \Delta\tilde{\mathbf{h}}(n)$

The arithmetic complexity can be greatly reduced in the first step of the algorithm, taking into account that the vector $\tilde{\mathbf{x}}(n)$ has the time-shift property [see (25)] and the matrix $\mathbf{R}_{\tilde{x}}(n)$ is Hermitian. Thus, only the first two columns of this matrix have to be computed, i.e.,

$$\mathbf{R}_{\tilde{x}}^{(1)}(n) = \lambda \mathbf{R}_{\tilde{x}}^{(1)}(n-1) + x^*(n) \tilde{\mathbf{x}}(n), \tag{28}$$

$$\mathbf{R}_{\tilde{x}}^{(2)}(n) = \lambda \mathbf{R}_{\tilde{x}}^{(2)}(n-1) + x(n) \tilde{\mathbf{x}}(n), \tag{29}$$

since the lower-right $(2L-2) \times (2L-2)$ block of $\mathbf{R}_{\tilde{x}}(n)$ can be obtained by copying the $(2L-2) \times (2L-2)$ upper-left block of $\mathbf{R}_{\tilde{x}}(n-1)$.

The DCD algorithm [12] is based on coordinate descent iterations with a power-of-two variable step-size. It does not need multiplications or divisions (these operations are simply replaced by bit-shifts), but only additions, so that it is well suited to hardware implementation. In our case, the auxiliary normal equation from step 4 is solved by using the complex-valued cyclic DCD algorithm [8], where the solution vector $\Delta\tilde{\mathbf{h}}(n)$ is updated in the directions of the Euclidean coordinates in a cyclic manner. There are two pre-defined parameters that have to be selected within the DCD algorithm. First, we need to fix $M_b$, i.e., the number of bits used for a fixed-point representation of the elements of the solution vector. The second parameter to be chosen is $N_u$, i.e., the maximum number of allowed updates or "successful" iterations [7,8]. Due to the lack of space, we do not further detail the DCD algorithm. Insightful analysis and detailed implementation of this algorithm can be found in [7,8], respectively. Most importantly, the arithmetic complexity of this algorithm is proportional to $2LN_u$ (where $N_u \ll L$) but using only additions, which is very attractive in practice.
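The five steps of Table 1 can be sketched in floating point as follows. This is an illustrative reading of the table, not the fixed-point implementation of [7,8]: the full matrix is updated in step 1 instead of only two columns, the DCD amplitude range `H = 1.0` is an assumption, and the demo uses a short hypothetical filter (`L = 4`), white noise input, and parameter values picked for a quick run rather than those of the paper.

```python
import math
import random

def dcd_complex(R, p, n_updates, m_bits, H=1.0):
    """Cyclic complex-valued DCD: approximately solve R*dh = p (R Hermitian).
    Returns (dh, r), where r = p - R*dh is the residual vector."""
    N = len(p)
    dh, r = [0j] * N, list(p)
    step, updates = H, 0
    for _ in range(m_bits):
        step /= 2.0                                   # power-of-two step size
        while True:
            changed = False
            for k in range(N):
                for use_imag in (False, True):        # real, then imaginary direction
                    proj = r[k].imag if use_imag else r[k].real
                    if abs(proj) > 0.5 * step * R[k][k].real:
                        c = math.copysign(step, proj) * (1j if use_imag else 1.0)
                        dh[k] += c
                        for i in range(N):            # r <- r - c * (k-th column of R)
                            r[i] -= c * R[i][k]
                        changed = True
                        updates += 1
                        if updates >= n_updates:      # at most N_u successful updates
                            return dh, r
            if not changed:
                break
    return dh, r

def rls_dcd(x_seq, d_seq, lam, delta, n_updates, m_bits):
    N = len(x_seq[0])
    h, r = [0j] * N, [0j] * N
    R = [[complex(delta) if i == j else 0j for j in range(N)] for i in range(N)]
    for xt, d in zip(x_seq, d_seq):
        for i in range(N):                                            # Step 1
            for j in range(N):
                R[i][j] = lam * R[i][j] + xt[i] * xt[j].conjugate()
        e = d - sum(hk.conjugate() * xk for hk, xk in zip(h, xt))     # Step 2
        p = [lam * rk + xk * e.conjugate() for rk, xk in zip(r, xt)]  # Step 3
        dh, r = dcd_complex(R, p, n_updates, m_bits)                  # Step 4
        h = [hk + dk for hk, dk in zip(h, dh)]                        # Step 5
    return h

random.seed(3)
L = 4
h_true = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2 * L)]
x_hist, xs, ds = [0j] * L, [], []
for _ in range(2000):
    x_hist = [complex(random.gauss(0, 1), 0.5 * random.gauss(0, 1))] + x_hist[:-1]
    xt = [v for xk in x_hist for v in (xk, xk.conjugate())]   # interleaved, Eq. (25)
    noise = 0.01 * complex(random.gauss(0, 1), random.gauss(0, 1))
    xs.append(xt)
    ds.append(sum(hk.conjugate() * xk for hk, xk in zip(h_true, xt)) + noise)

h_hat = rls_dcd(xs, ds, lam=0.99, delta=0.01, n_updates=8, m_bits=16)
num = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(h_true, h_hat)))
den = math.sqrt(sum(abs(a) ** 2 for a in h_true))
mis_db = 20 * math.log10(num / den)                   # normalized misalignment (13)
```

In the sketch, each successful DCD update changes one real or imaginary coordinate by a power-of-two step. In a fixed-point design this is what allows the multiplications by `c` to be replaced with bit-shifts and additions.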
5. Simulation results
Fig. 3. MSE of the RLS algorithm and RLS-DCD algorithm using different values for $N_u$ and $M_b = 16$. The forgetting factor is $\lambda = 1 - 1/(14L)$, with $L = 512$. The source signal is speech; preprocessing with (23) and (24) and $\alpha_r = 0.3$. [Plot: MSE (dB) versus time (seconds); curves: RLS, RLS-DCD with $N_u = 4$, $N_u = 8$, and $N_u = 16$.]

Fig. 4. Misalignment of the RLS-DCD algorithm (using $N_u = 8$ and $M_b = 16$) for different types of distortion with $\alpha_r = 0.3$. The forgetting factor is $\lambda = 1 - 1/(14L)$, with $L = 512$. The source signal is speech. [Plot: misalignment (dB) versus time (seconds); curves: without distortion, positive and negative half-wave rectifiers, new distortion.]
Simulations are performed in the context of the proposed WL model for SAEC (Fig. 1). The acoustic impulse responses in the far-end location [i.e., $\mathbf{g}_L(n)$ and $\mathbf{g}_R(n)$] have 2048 coefficients, while the length of the impulse responses in the near-end location [i.e., $\mathbf{h}_{t,LL}(n)$, $\mathbf{h}_{t,RL}(n)$, $\mathbf{h}_{t,LR}(n)$, and $\mathbf{h}_{t,RR}(n)$] is $L = 512$. The length of the adaptive filter $\tilde{\mathbf{h}}(n)$ is $2L = 1024$ and the sampling rate is 8 kHz. The source signal in the far-end location is a speech sequence. All simulations are performed in the single-talk scenario, i.e., in the absence of a near-end talker. In this case, the near-end signal $v(n)$ consists only of the background noise. We can define the stereo echo-to-noise ratio (SENR) [which is equivalent to the signal-to-noise ratio (SNR)] as $\mathrm{SENR} = \sigma_y^2 / \sigma_v^2$, where $\sigma_y^2 = E[|y(n)|^2]$ and $\sigma_v^2 = E[|v(n)|^2]$ are the variances of $y(n)$ and $v(n)$, respectively. In our simulations, the background noise in the near-end is an independent white Gaussian signal and its level is set such that SENR = 30 dB. Two performance measures are used, i.e., (a) the normalized misalignment (in dB) as defined in (13) and (b) the mean-square error (MSE), averaged over 256 points for the purpose of smoothing the results.

In the first experiment, the classical RLS algorithm with the WL model [2] is compared with the RLS-DCD algorithm using different values of $N_u$. The proposed distortion [see (23)–(24)] (with $\alpha_r = 0.3$) is used to preprocess the far-end microphone signals. The forgetting factor for all the algorithms is $\lambda = 1 - 1/(14L)$ and the initialization constant is $\delta = 0.001$. For the initialization of the RLS-DCD algorithm we used $M_b = 16$. For this first simulation, the misalignment plots are given in Fig. 2 and the corresponding MSE curves are depicted in Fig. 3. As we can notice from Fig. 2, the convergence rate of the RLS-DCD algorithm improves as $N_u$ increases, but only up to a certain value, i.e., $N_u = 8$. In this case, the RLS-DCD outperforms the classical RLS algorithm in terms of misalignment level.
Fig. 2. Misalignment of the RLS algorithm and RLS-DCD algorithm using different values for $N_u$ and $M_b = 16$. The forgetting factor is $\lambda = 1 - 1/(14L)$, with $L = 512$. The source signal is speech; preprocessing with (23) and (24) and $\alpha_r = 0.3$. [Plot: misalignment (dB) versus time (seconds); curves: RLS, RLS-DCD with $N_u = 4$, $N_u = 8$, and $N_u = 16$.]
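The performance measures used in these experiments can be computed as in the following sketch. The function names are hypothetical, and two details are assumptions rather than statements from the paper: the MSE is smoothed over non-overlapping blocks of 256 samples (a moving average would be another reading of "averaged over 256 points"), and the noise level is set by scaling a given noise sequence to reach the target SENR.

```python
import math

def misalignment_db(h_true, h_est):
    """Normalized misalignment of Eq. (13), in dB, for complex coefficient lists."""
    num = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est)))
    den = math.sqrt(sum(abs(a) ** 2 for a in h_true))
    return 20.0 * math.log10(num / den)

def smoothed_mse_db(errors, block=256):
    """MSE of the (complex) error signal averaged over consecutive blocks, in dB."""
    out = []
    for start in range(0, len(errors) - block + 1, block):
        mse = sum(abs(e) ** 2 for e in errors[start:start + block]) / block
        out.append(10.0 * math.log10(mse))
    return out

def noise_gain_for_senr(echo, noise, senr_db):
    """Gain g such that var(echo) / var(g * noise) equals the requested SENR."""
    p_y = sum(abs(v) ** 2 for v in echo) / len(echo)
    p_v = sum(abs(v) ** 2 for v in noise) / len(noise)
    return math.sqrt(p_y / (p_v * 10.0 ** (senr_db / 10.0)))

# Small usage example with made-up numbers
mis = misalignment_db([1 + 0j, 0j], [0.9 + 0j, 0j])        # about -20 dB
mse_curve = smoothed_mse_db([0.1] * 512)                   # two blocks, about -20 dB each
gain = noise_gain_for_senr([1, -1, 1, -1], [1, 1, -1, -1], 30.0)
```

With the made-up inputs above, a 10% coefficient error gives a misalignment of about -20 dB, and a noise gain of about 0.0316 yields an SENR of 30 dB for unit-power echo and noise.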
Besides, according to Fig. 3, the RLS-DCD algorithm also outperforms the classical RLS in terms of the MSE.

In the second experiment, we compare the performance of the RLS-DCD algorithm (with $N_u = 8$) using positive and negative half-wave rectifiers [see (14)–(15)] versus the new proposed distortion [see (23)–(24)]; also, the case without distortion is shown as a reference. The distortion parameter is set to $\alpha_r = 0.3$. Other parameters are the same as in the previous simulation. It can be noticed from Fig. 4 that the misalignment is reduced by the new distortion. Also, according to Fig. 5, the new distortion leads to a better performance in terms of the MSE as compared to the positive and negative half-wave rectifiers method. To justify this behavior, we depict in Fig. 6 the coherence function between the two channels (estimated using the Welch method). We can see that the new distortion leads to a weaker coherence between the channels compared to the positive and negative half-wave rectifiers. This difference is visible especially at higher frequencies. Several informal subjective tests were conducted in order to evaluate the speech quality in the case of the new distortion. The results indicate that the audio quality and the stereo effects are well preserved.

Fig. 5. MSE of the RLS-DCD algorithm (using $N_u = 8$ and $M_b = 16$) for different types of distortion with $\alpha_r = 0.3$. The forgetting factor is $\lambda = 1 - 1/(14L)$, with $L = 512$. The source signal is speech. [Plot: MSE (dB) versus time (seconds); curves: without distortion, positive and negative half-wave rectifiers, new distortion.]

Fig. 6. Magnitude squared coherence function for different types of distortion with $\alpha_r = 0.3$. The source signal is a speech sequence. [Plot: coherence versus frequency (kHz), 0 to 4 kHz; curves: without distortion, positive and negative half-wave rectifiers, new distortion.]

6. Conclusions

In this paper, we have recast the SAEC problem as a single-input/single-output system with complex random variables by using the WL model. As a consequence, the four real-valued acoustic impulse responses are converted to one complex-valued impulse response. The main advantage of this approach is that instead of handling two (real) output signals separately, we only handle one (complex) output signal. In this framework, we proposed a new distortion that fits well with the WL model and leads to good performance for speech signals. Finally, the RLS-DCD algorithm was implemented in the context of the WL model for SAEC. Simulation results indicate that this algorithm represents a reliable choice for practical SAEC applications, since it achieves a fast convergence rate but also has good numerical features.

Acknowledgments

The work of the first author was supported under the Grant POSDRU/107/1.5/S/76903. This work was also supported under the Grant UEFISCDI PN-II-RU-TE no. 7/5.08.2010 and the Grant UEFISCDI PN-II-ID-PCE2011-3-0097.

References
[1] J. Benesty, T. Gänsler, D.R. Morgan, M.M. Sondhi, S.L. Gay, Advances in Network and Acoustic Echo Cancellation, Springer-Verlag, Berlin, Germany, 2001.
[2] J. Benesty, C. Paleologu, T. Gänsler, S. Ciochină, A Perspective on Stereophonic Acoustic Echo Cancellation, Springer-Verlag, Berlin, Germany, 2011.
[3] M.M. Sondhi, D.R. Morgan, J.L. Hall, Stereophonic acoustic echo cancellation - an overview of the fundamental problem, IEEE Signal Processing Letters 2 (8) (1995) 148–151.
[4] J. Benesty, D.R. Morgan, M.M. Sondhi, A better understanding and an improved solution to the specific problems of stereophonic acoustic echo cancellation, IEEE Transactions on Speech and Audio Processing 6 (3) (1998) 156–165.
[5] B. Picinbono, P. Chevalier, Widely linear estimation with complex data, IEEE Transactions on Signal Processing 43 (8) (1995) 2030–2033.
[6] C. Stanciu, J. Benesty, C. Paleologu, T. Gänsler, S. Ciochină, A novel perspective on stereophonic acoustic echo cancellation, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2012, pp. 25–28.
[7] Y.V. Zakharov, G.P. White, J. Liu, Low-complexity RLS algorithms using dichotomous coordinate descent iterations, IEEE Transactions on Signal Processing 56 (7) (2008) 3150–3161.
[8] J. Liu, Y.V. Zakharov, B. Weaver, Architecture and FPGA design of dichotomous coordinate descent algorithms, IEEE Transactions on Circuits and Systems I: Regular Papers 56 (11) (2009) 2425–2438.
[9] D.P. Mandic, S. Still, S.C. Douglas, Duality between widely linear and dual channel adaptive filtering, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2009, pp. 1729–1732.
[10] S. Emura, Y. Haneda, A. Kataoka, S. Makino, Stereo echo cancellation algorithm using adaptive update on the basis of enhanced input-signal vector, Signal Processing 86 (2006) 1157–1167.
[11] B.C.J. Moore, An Introduction to the Psychology of Hearing, Academic Press, London, UK, 1989.
[12] Y.V. Zakharov, T.C. Tozer, Multiplication-free iterative algorithm for LS problem, IEE Electronics Letters 40 (4) (2004) 567–569.