
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 6, NOVEMBER 2002

Recursive Least-Squares Backpropagation Algorithm for Stop-and-Go Decision-Directed Blind Equalization

Shafayat Abrar, Azzedine Zerguine, Member, IEEE, and Maamar Bettayeb

Abstract—Stop-and-go decision-directed (S&G-DD) equalization is the most primitive blind equalization (BE) method for the cancellation of intersymbol interference in data communication systems. Recently, this scheme has been applied to complex-valued multilayer feedforward neural networks, giving robust results with a lower mean-square error at the expense of slow convergence. To overcome this problem, in this work, a fast-converging recursive least squares (RLS)-based complex-valued backpropagation learning algorithm is derived for S&G-DD blind equalization. Simulation results show the effectiveness of the proposed algorithm in terms of initial convergence.

Index Terms—Complex-valued backpropagation algorithm, recursive least squares (RLS) algorithm, stop-and-go decision-directed (S&G-DD) blind equalization (BE) algorithm.

I. INTRODUCTION

EQUALIZATION techniques based on initial adjustment of the weights without a training sequence are known as self-recovering or blind schemes. There exists a class of stochastic-gradient iterative blind equalization (BE) schemes [1]–[3] which are based on the least-mean-squares (LMS) algorithm and apply a memoryless nonlinearity to the output of a linear finite-impulse response (FIR) equalization filter for the purpose of generating the desired response [4]. However, a linear FIR filter structure is not adequate to optimize such nonconvex functions because its decision region is convex [5]. Therefore, a BE scheme with a nonlinear structure is necessary. Multilayer feedforward neural networks provide a powerful device for approximating a nonlinear input–output mapping of a general nature. Recently, You and Hong [6] proposed four nonlinear BE schemes using complex-valued multilayer feedforward neural networks. One of their schemes was the stop-and-go decision-directed algorithm based on neural networks (NNS&G-DDA). For this scheme, they derived a complex BP algorithm for the well-known stop-and-go decision-directed algorithm (S&G-DDA), devised by Picchi and Prati [3], and concluded that the NNS&G-DDA is the most robust when compared to the other three proposed schemes. However, NNS&G-DDA's weak point is its slow convergence. One possible reason for the slow convergence is that the BP training algorithm of the multilayer perceptron is a generalized LMS algorithm and thus suffers from the same problem, i.e., slow convergence, as the LMS. From the viewpoint of adaptive filtering theory, it is well known that the recursive least squares (RLS) algorithm is typically an order of magnitude faster than the LMS algorithm [7]. Thus, to speed up the convergence of NNS&G-DDA, the weights in each layer can be adjusted using the RLS algorithm. In view of this, the prime objective of this paper is to focus on the gains, in terms of speed of convergence, that are brought about by the novel application of an RLS-based complex backpropagation (CBP) learning algorithm to S&G-DDA. As is well known, the RLS algorithm carries a heavy computational load. However, as this is an implementation issue of paramount importance to practical applications, it warrants a separate study by itself.

This paper is organized as follows: The data and neural equalizer model are described in Section II. S&G-DDA is described in Section III. The proposed scheme, RLS-based NNS&G-DDA, is described in Section IV. Simulation results are shown in Section V, while conclusions are drawn in Section VI.

II. DATA AND EQUALIZER MODEL

Consider a multipath digital communication system. When the channel is finite and time-invariant, the received discrete-time baseband signal can be expressed as

x(n) = Σ_{k=0}^{N−1} h(k) a(n − k) + ν(n)   (1)

where a(n) is a complex data sequence sent over a communication channel of length N, with symbols spaced T seconds apart. The objective of blind equalization is to estimate the input sequence a(n) given only the received signal sequence x(n). The noise ν(n) is zero-mean and uncorrelated with a(n), and σ_ν² is the variance of the noise.

Manuscript received October 16, 2000; revised July 9, 2001 and November 29, 2001. S. Abrar is with the Computer Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia (e-mail: [email protected]). A. Zerguine is with the Electrical Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia (e-mail: [email protected]). M. Bettayeb is with the Electrical and Electronics Engineering Department, University of Sharjah, Sharjah, United Arab Emirates (e-mail: [email protected]). Digital Object Identifier 10.1109/TNN.2002.804282
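The channel model of (1) is a complex FIR convolution of the data sequence with the channel impulse response, plus additive noise. A minimal sketch (function and variable names are illustrative assumptions, not the paper's notation):

```python
import random

def received_signal(a, h, noise_std=0.0, seed=0):
    """Baseband model of (1): x(n) = sum_k h(k) a(n-k) + nu(n).
    a: complex data sequence, h: channel impulse response,
    nu: zero-mean complex Gaussian noise with per-axis std noise_std."""
    rng = random.Random(seed)
    x = []
    for n in range(len(a)):
        acc = 0j
        for k, hk in enumerate(h):
            if 0 <= n - k < len(a):
                acc += hk * a[n - k]  # convolution term h(k) a(n-k)
        if noise_std > 0.0:
            acc += complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
        x.append(acc)
    return x
```

With `noise_std=0.0` the output is the pure channel convolution, which is convenient for checking the intersymbol-interference pattern a blind equalizer must undo.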

1045-9227/02$17.00 © 2002 IEEE


Fig. 1. Model of the ith neuron in the kth layer.

Throughout this paper, we will use the notations (·)^T, (·)^H, (·)^*, and E[·] for the transpose, Hermitian, complex conjugate, and expectation, respectively. Now, consider an L-layered feedforward neural equalizer having N_k neurons in the kth layer, k = 1, …, L. The observation interval is of length m. We define the following notation:

R (subscript): delimiter for the real part.
I (subscript): delimiter for the imaginary part.
f(·): activation function of a neuron.
f′(·): derivative of the activation function of a neuron.
x(n): vector of input signals of the neural network.
y_i^k(n): output signal of the ith neuron, i = 1, …, N_k, in the kth layer.
y^k(n): vector of output signals in the kth layer.
x_i^k(n): ith input for the kth layer, where x_i^1(n) = x_i(n) and x_i^k(n) = y_i^{k−1}(n) for 1 < k ≤ L.
x^k(n): vector of input signals in the kth layer.
w_{ij}^k(n): weight connecting the jth input of the kth layer with the ith neuron, i = 1, …, N_k.
w_i^k(n): vector of weights of the ith neuron in the kth layer.
u_i^k(n): linear output of the ith neuron in the kth layer.

Fig. 1 depicts these notations for the ith neuron in the kth layer. The nonlinear activation function used is parameterized by a constant α that determines the degree of nonlinearity in the network. For our two-layer neural equalizer, separate notations denote the nonlinearities in the input and output layers.
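A common device in complex backpropagation, consistent with the split real/imaginary notation above, is to apply the activation separately to the real and imaginary parts of a neuron's linear output. A sketch of the forward pass for one neuron; the specific function f(v) = v + α sin(πv) is used here only as an example of a slope-controlled nonlinearity, not as the paper's exact choice:

```python
import math

def neuron_forward(w, x, alpha=0.3):
    """Forward pass of one complex-valued neuron with a split
    activation: f acts independently on the real and imaginary
    parts of the linear output u = sum_j w_j x_j."""
    u = sum(wj * xj for wj, xj in zip(w, x))  # linear output u_i^k(n)
    f = lambda v: v + alpha * math.sin(math.pi * v)  # example nonlinearity
    return complex(f(u.real), f(u.imag))
```

Setting `alpha=0` recovers a purely linear neuron, which matches the role of the slope parameter as the degree of nonlinearity.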

III. S&G-DDA

The S&G-DDA, devised by Picchi and Prati [3], is a hybrid scheme in the sense that it uses the error functions of both the decision-directed algorithm (DDA) and the generalized Sato algorithm [2], [8]. The Sato algorithm has in general proven able to open an initially closed channel eye more successfully than the DDA [2]. It has also been proven that the DDA cannot converge if the initial channel eye is closed; however, once the eye is already open, it achieves faster convergence and a lower steady-state error with better tracking capability [9]. S&G-DDA uses a simple flag informing both the equalizer and the synchronizer whether the current output error with respect to the decided symbol is sufficiently reliable to be used; otherwise, adaptation is stopped for the current iteration. As a result, this algorithm provides effective blind convergence in the mean-square error (MSE) sense.

Now, let y(n) be the complex-valued output of a linear equalizer and â(n) be the decision made on the output y(n). For quadrature amplitude modulated (QAM) signals, â(n) will be the symbol that has the smallest Euclidean distance from y(n). For square QAM constellations, e.g., 16, 64, and higher, â(n) can be computed as shown in (2) and (3).

The signum function sgn(·) is equal to +1 if its argument is positive and −1 if it is negative. The decision-directed (DD) error is expressed as (4). The Sato-like error, which is used to check the reliability of the equalizer output, is defined as (5), where csgn(y(n)) = sgn(y_R(n)) + j sgn(y_I(n)) is the sign of the real and imaginary parts of y(n). Picchi and Prati found that if the adaptation is stopped for the small proportion of instants at which the DD and the Sato errors have different signs, the equalizer will eventually converge. The stop-and-go (S&G) flags for the
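For a square QAM constellation, the nearest-symbol decision decomposes into independent per-axis slicing, and the DD and Sato-like errors then follow directly. A sketch, assuming the common convention that both errors are measured as reference minus equalizer output (the exact sign convention of (4) and (5) is not visible in this copy):

```python
def qam_decide(y, levels=(-3.0, -1.0, 1.0, 3.0)):
    """Nearest-symbol decision for square QAM: slice the real and
    imaginary axes independently to the closest amplitude level."""
    nearest = lambda v: min(levels, key=lambda s: abs(v - s))
    return complex(nearest(y.real), nearest(y.imag))

def dd_error(y, levels=(-3.0, -1.0, 1.0, 3.0)):
    """Decision-directed error: decided symbol minus equalizer output."""
    return qam_decide(y, levels) - y

def sato_error(y, beta=3.0):
    """Sato-like error: beta * csgn(y) - y, where csgn takes the sign
    of the real and imaginary parts separately."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0
    return complex(beta * sgn(y.real), beta * sgn(y.imag)) - y
```

The per-axis structure is what lets the stop-and-go logic gate the real and imaginary error components independently.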


Fig. 2. (a) and (c): regions where the real-part flag equals 1 (shaded). (b) and (d): regions where the imaginary-part flag equals 1 (shaded).

Fig. 3. Learning curves for both schemes with 16-QAM signaling and channel H(z).

real and imaginary parts of the equalizer output can be defined, respectively, as

f_R(n) = 1 (go) if the real parts of the DD error (4) and the Sato-like error (5) agree in sign, and f_R(n) = 0 (stop) otherwise; (6)

f_I(n) = 1 (go) if the imaginary parts of (4) and (5) agree in sign, and f_I(n) = 0 (stop) otherwise. (7)

These binary-valued flags check the reliability of the real and the imaginary parts of the equalizer output and thus allow or prohibit the weight adaptation process for their corresponding "1" and "0" values. In a linear adaptive filter, these flags are incorporated in the weight adaptation rule as follows:

w(n + 1) = w(n) + μ [f_R(n) e_R(n) + j f_I(n) e_I(n)] x*(n) (8)

where e(n) is the DD error of (4), μ is the step size, and x(n) is the regressor. The constant β used in (5) plays an important role in determining those regions where the equalized output is considered reliable. For an M-QAM signal having a finite data set, the value of β that gives the maximum covering of those reliable regions can be computed as in (9).

In Fig. 2, the reliable regions (indicated as shaded zones) are shown for 16- and 32-QAM constellations; the value of β is taken as three and five for the 16- and 32-QAM signals, respectively. Fig. 2(a) and (c) show the reliable regions for the real part of the equalizer output for the 16- and 32-QAM constellations, respectively. Similarly, Fig. 2(b) and (d) show the reliable regions for the imaginary part of the equalizer output for the same two constellations. A detailed simulation-based analysis of these regions is given in [3].

Fig. 4. Learning curves for both schemes with 16-QAM signaling and channel H(z).

S&G-DDA belongs to the class of stochastic-gradient algorithms that exhibit the Bussgang property during steady state [7]. A Bussgang process has the property that its autocorrelation function equals the cross-correlation between the process and the output of a zero-memory nonlinearity driven by that process, with both correlations measured at the same lag. Accordingly, the equalizer output satisfies the condition

E[y(n) y(n − k)] = E[y(n) g(y(n − k))] (10)

where g(·) is a zero-memory nonlinearity, defined as the Bayes estimator for the case of the Bussgang algorithm for BE [7], and g(y(n)) is considered the desired response. Therefore, the weight adaptation rule for a Bussgang-based BE scheme is easily obtained as (11).
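The stop-and-go logic of (6)-(8) can be sketched for a linear equalizer as follows. This is a hedged illustration: the flag compares the signs of the DD and Sato-like error components, and the update of each axis is frozen when they disagree; variable names are assumptions:

```python
def sgn(v):
    return 1.0 if v >= 0 else -1.0

def stop_and_go_step(w, x, e_dd, e_sato, mu=0.01):
    """One stop-and-go weight update for a linear equalizer.
    The real/imaginary flags are 1 (go) when the DD and Sato-like
    error components agree in sign, 0 (stop) otherwise; the gated
    error then drives an LMS-style update with the conjugate regressor."""
    f_r = 1.0 if sgn(e_dd.real) == sgn(e_sato.real) else 0.0
    f_i = 1.0 if sgn(e_dd.imag) == sgn(e_sato.imag) else 0.0
    gated = complex(f_r * e_dd.real, f_i * e_dd.imag)
    return [wi + mu * gated * xi.conjugate() for wi, xi in zip(w, x)]
```

When both flags are zero the weights pass through unchanged, which is exactly the "stop" behavior that keeps unreliable DD errors from dragging a closed-eye equalizer toward a false minimum.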


Fig. 5. Learning curves for both schemes with 32-QAM signaling and channel H(z).

Fig. 6. Learning curves for both schemes with 32-QAM signaling and channel H(z).

Fig. 7. Learning curves for both schemes with 64-QAM signaling and channel H(z).

Fig. 8. Learning curves for both schemes with 64-QAM signaling and channel H(z).

Comparing (8) and (11), the nonlinear estimate used in the S&G algorithm is given by (12). Using (4), (12) becomes the four-case expression (13).

IV. RLS-CBP ALGORITHM FOR S&G-DDA

In the LMS algorithm, the cost function is a quadratic convex function of the tap weights and therefore has a well-defined minimum point. By contrast, the cost function of S&G-DDA, like that of other Bussgang algorithms operating with a finite filter length, is nonconvex; it may, therefore, have false minima. In general, the convergence of the Bussgang algorithm is not guaranteed. Based on the fact that a neural network can form nonconvex decision boundaries, You and Hong [6] proposed learning algorithms for S&G and other BE schemes using complex-valued multilayer feedforward neural networks. They showed that NNS&G-DDA performs better than its linear counterpart, and that it is robust (showing stable convergence) with very low steady-state MSE compared to the rest of their proposed schemes. However, NNS&G-DDA's weak point is its slow convergence. Thus, NNS&G-DDA provides a tradeoff between initial convergence speed and steady-state MSE. Next, the NNS&G-DDA is derived using the RLS algorithm to obtain a fast-converging equalizer.

In the past decade, the use of the RLS algorithm for training feedforward multilayer perceptron neural networks has been investigated extensively [10], [11], but for the real-valued case only. In this section, we derive an RLS-based complex backpropagation algorithm for the NNS&G-DDA scheme. Let ξ be the performance measure, the sum of squares of the absolute values of the difference between the equalizer output and the nonlinear memoryless estimate. Since, in our case, there is a single neuron in the Lth (output) layer, and since, for the sake of generality, it is assumed that there are N_k neurons in the kth layer, ξ is defined as (14).


where λ is a positive constant (called the forgetting factor) close to, but less than, one. The performance index ξ is minimized by taking its partial derivative with respect to the weights and setting it equal to zero, that is, (15). Equation (15) reduces to the following (see Appendix I): (16). Equation (16) can be represented in vector form as (see Appendix II for details) (17), with the definitions in (18). Substituting into (17), we get (19).

Now, define a matrix operation (20) for simplicity, where a_R and b_R are the real parts of the complex numbers a and b, respectively, and a_I and b_I are the corresponding imaginary parts. Using (20), (18) and (19) can be written, respectively, in simplified form as (21) and (22). The complex quantity denoting the reliability of the output of the ith neuron in the kth layer is assumed to be available; however, it will be replaced with the flags f_R(n) and f_I(n) defined in (6) and (7). Observe that (21) and (22) can be expressed, respectively, in the following recursive forms: (23) and (24). Invoking the matrix inversion lemma [7], the inverse of the correlation matrix in (24) can be found to be (25), where (26) (Appendix III details the derivation). Substituting this value into the weight recursion, we get (27). Incorporating the learning rate μ in (27) results in (28).

The backpropagating error for the ith neuron of the kth layer is calculated as shown in (29) at the bottom of the page.
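For a linear-in-the-weights layer, recursions of the kind in (23)-(28) specialize to the standard exponentially weighted RLS step; a pure-Python sketch of one such step follows (variable names and the y = w^H x convention are assumptions, following the usual adaptive-filtering formulation [7], not the paper's exact per-layer equations):

```python
def rls_update(w, P, x, d, lam=0.96):
    """One RLS step for a linear-in-the-weights model y = w^H x.
    w, x: lists of complex weights/inputs; P: inverse correlation
    matrix (list of lists); d: desired response; lam: forgetting factor."""
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]   # P x
    xhPx = sum(x[i].conjugate() * Px[i] for i in range(n))           # x^H P x
    denom = lam + xhPx
    k = [Px[i] / denom for i in range(n)]                            # gain vector
    y = sum(w[i].conjugate() * x[i] for i in range(n))               # a priori output
    e = d - y                                                        # a priori error
    w_new = [w[i] + k[i] * e.conjugate() for i in range(n)]          # w += k e*
    xhP = [sum(x[i].conjugate() * P[i][j] for i in range(n)) for j in range(n)]
    P_new = [[(P[i][j] - k[i] * xhP[j]) / lam for j in range(n)]     # Riccati update
             for i in range(n)]
    return w_new, P_new, e
```

Initializing P as a large multiple of the identity and iterating over incoming samples drives the weights toward the least-squares solution in very few updates, which is the fast-convergence behavior the paper exploits.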


TABLE I SUMMARY OF RLSNNS&G-DDA AND ITS COMPUTATIONAL LOAD

Finally, this algorithm is named RLSNNS&G-DDA; it is summarized in Table I along with its computational load. The extra load of RLSNNS&G-DDA, as compared to NNS&G-DDA, is detailed in Table I.

V. SIMULATION RESULTS

The performance of the RLSNNS&G-DDA scheme is compared with that of the NNS&G-DDA scheme for 16-, 32-, and 64-QAM signals. The performance measure is the MSE.
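The MSE learning curves reported below are ensemble averages: the squared error magnitude at each iteration is averaged across independent realizations. A small helper illustrating that computation (the function name is an assumption):

```python
def ensemble_mse(error_runs):
    """Ensemble-averaged MSE learning curve. error_runs is a list of
    per-realization complex error sequences of equal length; the curve
    is the mean of |e|^2 across realizations at each iteration."""
    n_runs = len(error_runs)
    n_iter = len(error_runs[0])
    return [sum(abs(run[t]) ** 2 for run in error_runs) / n_runs
            for t in range(n_iter)]
```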

(29)


Fig. 9. Scatter diagram before and after the equalization.

Two complex-valued communication channels, taken from [6] and [12, channel 1], are used in our simulation study. These channels have relatively flat frequency responses; however, their binary eyes are closed, and a simple decision-directed attempt (i.e., without S&G) to train the equalizer fails. The signal-to-noise ratio (SNR) is set to 30 dB in all cases. All weights are initialized randomly with small complex numbers, with the standard deviation taken as 10 and 10 for the RLSNNS&G-DDA and NNS&G-DDA schemes, respectively. The central weights in the input and output layers are initialized as . The forgetting factor λ is taken as 0.96, and all the matrices are initialized as 10 I, where I is the identity matrix. The slope parameter α in the activation functions of the input and output layers is kept as 0 and 0.3, respectively, in both NNS&G-DDA and RLSNNS&G-DDA. β is three, five, and seven for the 16-, 32-, and 64-QAM signals, respectively. The value of is 15 and 23 for the two channels, respectively, whereas is nine for all cases. Learning rates for both algorithms are shown in the figures. Note that, for a fair comparison, the highest possible learning rates are used for both schemes. The convergence behavior of both schemes is evaluated using the MSE, which is calculated by taking the ensemble average over 50 realizations of independent data and weight initializations.

For 16-QAM signaling, the performance of the two schemes over the two channels is shown in Figs. 3 and 4, respectively. Similarly, Figs. 5 and 6 depict the performance of the two schemes for 32-QAM signaling. It can be observed from these figures that RLSNNS&G-DDA outperforms the NNS&G-DDA scheme as far as convergence is concerned while, at the same time, resulting in low excess error, which was the objective of this work. Similar behavior is obtained for 64-QAM signaling, as shown in Figs. 7 and 8. Finally, Fig. 9 depicts the signal constellations before and after equalization for 16-QAM signaling with the RLSNNS&G-DDA scheme, where each constellation contains 2000 data points.

VI. CONCLUSION

The results of our study can be summarized briefly as follows:

1) An RLS-based BP algorithm for complex-valued neural networks is derived.

2) The derived algorithm has been tested over two complex communication channels and implemented in an NNS&G-DDA BE scheme. Its performance is also found to be consistent over different signal constellations.

3) The use of RLS-based BP for complex-valued neural networks results in substantial improvements in terms of convergence rate without affecting the steady-state excess error.

4) As pointed out in the Introduction, the reduction of the computational load of the proposed RLS-based algorithm will be the focus of our next study, in which other fast versions of the RLS, such as the BSLS algorithm [13], will be tested by simulation. Moreover, a convergence proof of the proposed algorithm will be the subject of further investigation.

5) If the speed gains obtained with the proposed RLS-based algorithm are to be preserved in practical applications, then a fast hardware implementation of the proposed algorithm, based either on parallel processing or on a customized very large-scale integration (VLSI) chip, is to be used. As an alternative to this purely hardware-based solution to the computational problem, a possible hybrid approach would combine fast hardware with a fast version of the RLS algorithm, such as the BSLS algorithm [13].

APPENDIX I

It can easily be seen that (15) can be converted further, as shown in the first equation at the bottom of the next page, where

is the error propagated backward. Next, we have (30), shown at the bottom of the next page. Since , (30) simplifies to


or, equivalently, the RHS expression shown. In order to linearize the LHS of the above equation, it is multiplied and divided by the same factor; the LHS can then be expressed as shown. Finally, using (20), the above equation takes the following form: (30)

APPENDIX II

Using the definition in (20), (16) can be written as


(32)

(33)

Here, we assume that the two estimates are approximately equal; thus, the current estimate is used, but this approximation is applied only to the last over-braced term. With this assumption, it is now possible to apply the matrix inversion lemma to obtain a recursive equation for the inverse, as shown in Appendix III.

APPENDIX III

Recall (24). Let (31); applying the matrix inversion lemma, we get (32), shown at the top of the page, where we have (33), also shown at the top of the page. Cross-multiplying (33), we obtain (34), and the result then follows.

ACKNOWLEDGMENT
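Appendix III rests on the matrix inversion lemma; in its rank-one form (the Sherman-Morrison identity) it can be checked numerically. A 2x2 sketch with illustrative values:

```python
def sherman_morrison_2x2(Ainv, x, y):
    """Inverse of (A + x y^H) from A^{-1} via the matrix inversion
    lemma, rank-one case:
    (A + x y^H)^{-1} = A^{-1} - (A^{-1} x)(y^H A^{-1}) / (1 + y^H A^{-1} x)."""
    Ax = [Ainv[0][0] * x[0] + Ainv[0][1] * x[1],
          Ainv[1][0] * x[0] + Ainv[1][1] * x[1]]                     # A^{-1} x
    yA = [y[0].conjugate() * Ainv[0][0] + y[1].conjugate() * Ainv[1][0],
          y[0].conjugate() * Ainv[0][1] + y[1].conjugate() * Ainv[1][1]]  # y^H A^{-1}
    denom = 1 + yA[0] * x[0] + yA[1] * x[1]                          # 1 + y^H A^{-1} x
    return [[Ainv[i][j] - Ax[i] * yA[j] / denom for j in range(2)]
            for i in range(2)]
```

This is the same identity that turns the batch least-squares matrix inverse into the O(n²) per-sample update used by RLS.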

The authors acknowledge the support of KFUPM. The authors would like to thank the anonymous reviewers for their constructive suggestions, which have helped improve the paper.

REFERENCES

[1] D. N. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communications systems," IEEE Trans. Commun., vol. COM-28, pp. 1867–1875, Nov. 1980.
[2] A. Benveniste and M. Goursat, "Blind equalizers," IEEE Trans. Commun., vol. COM-32, pp. 871–883, Aug. 1984.
[3] G. Picchi and G. Prati, "Blind equalization and carrier recovery using a 'stop-and-go' decision-directed algorithm," IEEE Trans. Commun., vol. COM-35, pp. 877–887, Aug. 1987.
[4] J. G. Proakis, Digital Communications, 4th ed. New York: McGraw-Hill, 2001.
[5] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[6] C. W. You and D. S. Hong, "Nonlinear blind equalization schemes using complex-valued multilayer feedforward neural networks," IEEE Trans. Neural Networks, vol. 9, pp. 1442–1455, Nov. 1998.
[7] S. Haykin, Adaptive Filter Theory, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall, 1996.
[8] Y. Sato, "A method of self-recovering equalization for multilevel amplitude modulation systems," IEEE Trans. Commun., vol. COM-23, pp. 679–682, June 1975.
[9] O. Macchi and E. Eweda, "Convergence analysis of self-adaptive equalizers," IEEE Trans. Inform. Theory, vol. IT-30, pp. 161–176, Mar. 1984.
[10] M. R. Azimi-Sadjadi and R.-J. Liou, "Fast learning process of multilayer neural networks using recursive least square methods," IEEE Trans. Signal Processing, vol. 40, pp. 446–450, Feb. 1992.
[11] J. Bilski and L. Rutkowski, "A fast training algorithm for neural networks," IEEE Trans. Circuits Syst. II, vol. 45, pp. 749–753, June 1998.
[12] S. C. Bateman and S. Y. Ameen, "Comparison of algorithms for use in adaptive adjustment of digital data receivers," Proc. Inst. Elect. Eng., pt. I, vol. 137, pp. 85–96, 1990.
[13] X.-H. Yu and Z.-Y. He, "Efficient block implementation of exact sequential least-squares problems," IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 392–399, Mar. 1988.


Shafayat Abrar was born in Karachi, Pakistan, in 1972. He received the B.E. degree in electrical engineering from NED University, Karachi, in 1996 and the M.S. degree in electrical engineering from King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia, in May 2000. He is currently pursuing the Ph.D. degree in the Department of Computer Engineering, KFUPM. His current interests include digital signal processing, blind deconvolution, algorithms, and fault tolerance in neural networks. Mr. Abrar is a lifetime member of the Pakistan Engineering Council.


Azzedine Zerguine (S’93–M’96) received the B.Sc. degree from Case Western Reserve University, Cleveland, OH, in 1981, the M.Sc. degree from King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia, in 1990, and the Ph.D. degree from Loughborough University, Loughborough, U.K., in 1996, all in electrical engineering. From 1981 to 1987, he worked for several Algerian state-owned companies. From 1987 to 1990, he was a Research and Teaching Assistant in the Electrical Engineering Department, KFUPM. In 1990, he joined the Physics Department at KFUPM as a Lecturer and, from 1997 to 1998, served as an Assistant Professor. In 1998, he joined the Department of Electrical Engineering at KFUPM, where he is presently an Assistant Professor working in the area of signal processing and communications. His research interests include signal processing for communications, adaptive filtering, neural networks, and interference cancellation.

Maamar Bettayeb received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Southern California, Los Angeles, in 1976, 1978, and 1981, respectively. He was a Research Engineer at the Bellaire Research Center, Shell Oil Development Company, Houston, TX, where he worked on the development of deconvolution algorithms for seismic signal processing for gas and oil exploration. From 1982 to 1988, he directed the Instrumentation and Control Laboratory of the High Commission for Research in Algeria, where he led various research and development projects in the field of modeling, simulation, and control design of large-scale energy systems, specifically model reduction, identification and decomposition, composite decentralized control, and simulation of computer-controlled systems, with applications to nuclear, solar, and electric power systems. In 1988, he joined the Electrical Engineering Department at King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. He has been a Professor at the University of Sharjah, Sharjah, United Arab Emirates, since August 2000. He has been consulting for the petrochemical industries and has also been involved in various R&D projects in the areas of process control and signal processing applications to ultrasonic nondestructive testing for defect evaluation. His recent research interests include optimal control, rational approximation, signal processing, process control, artificial intelligence, and industrial applications.
