IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 16, NO. 2, MARCH 2005
A Recurrent Neural Network for Solving Nonlinear Convex Programs Subject to Linear Constraints

Youshen Xia, Senior Member, IEEE, and Jun Wang, Senior Member, IEEE
Abstract—In this paper, we propose a recurrent neural network for solving nonlinear convex programming problems with linear constraints. The proposed neural network has a simpler structure and a lower complexity for implementation than the existing neural networks for solving such problems. It is shown here that the proposed neural network is stable in the sense of Lyapunov and globally convergent to an optimal solution within a finite time under the condition that the objective function is strictly convex. Compared with the existing convergence results, the present results do not require a Lipschitz continuity condition on the objective function. Finally, examples are provided to show the applicability of the proposed neural network.

Index Terms—Continuous methods, global convergence, linear constraints, recurrent neural networks, strictly convex programming.

Manuscript received March 28, 2002; revised August 30, 2004. This work was supported by the Hong Kong Research Grants Council under Grant CUHK4165/03E. Y. Xia is with the Department of Applied Mathematics, Nanjing University of Posts and Telecommunications, Nanjing 210003, China (e-mail: [email protected]). J. Wang is with the Department of Automation and Computer-Aided Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong (e-mail: [email protected]). Digital Object Identifier 10.1109/TNN.2004.841779
I. INTRODUCTION
Many engineering problems can be solved by transforming the original problems into linearly constrained convex optimization problems. For example, the least squares problem with linear equality constraints can be viewed as a basic framework that is widely used for system modeling and design in a variety of applications such as signal and image processing and pattern recognition [1]. In many applications, real-time solutions are usually imperative. One example of such applications in image processing is the solution of the image fusion problem in real-time wireless image transmission [2].

Compared with traditional numerical methods for constrained optimization, the neural network approach has several advantages in real-time applications. First, the structure of a neural network can be implemented effectively using VLSI and optical technologies. Second, neural networks can solve many optimization problems with time-varying parameters. Third, numerical ordinary differential equation (ODE) techniques can be applied directly to the continuous-time neural network for solving constrained optimization problems effectively. Therefore, neural network methods for optimization have received considerable attention [3]–[8]. Many continuous-time neural networks for constrained optimization problems have been developed [9]–[14], [16]–[20].

At present, there exist several neural networks for solving nonlinear convex optimization problems with linear constraints [3], [9]–[12], [16]–[20]. Kennedy and Chua [3] developed a neural network for solving nonlinear programming problems, which fulfills the Kuhn–Tucker optimality condition. Because the Kennedy–Chua network contains a finite penalty parameter, it generates approximate solutions only and has an implementation problem when the penalty parameter is very large. To avoid using penalty parameters, some significant work has been done in recent years. A few primal-dual neural networks with two-layer and one-layer structures were developed for solving linear and quadratic programming problems [9], [12], [18]. These neural networks were proved to be globally convergent to an exact solution when the objective function is convex. In addition, several projection neural networks for constrained optimization and related problems were developed [11], [19]. In [10], a two-layer neural network for solving nonlinear convex programming problems was studied. This network is shown to be globally convergent to an exact solution under a Lipschitz continuity condition on the objective function. To remove the Lipschitz continuity condition, Tao et al. [12] studied another two-layer neural network for solving nonlinear convex programming problems. Their network is shown to be globally convergent to an exact solution under a strict convexity condition on the objective function.

In this paper, we propose a one-layer neural network for solving such nonlinear convex programming problems. Furthermore, we extend the proposed neural network to solve a class of monotone variational inequalities with linear equality constraints. The proposed network is shown to be globally convergent to an exact solution within a finite time. Since low complexity of neural networks is of great significance from the viewpoint of computation and implementation, the proposed neural network is an attractive alternative to the existing neural networks for nonlinear optimization with linear equality constraints. Finally, simulation results and applied examples further confirm the effectiveness of the proposed neural network.

The paper is organized as follows. In Section II, the nonlinear convex programming problem with standard linear constraints and its equivalent formulation are described, and a neural network model is proposed to solve nonlinear convex programs with linear equality constraints. In Section III, the finite-time convergence of the proposed neural network is proved. In Section IV, a neural network for solving monotone variational inequalities with linear constraints is proposed and analyzed. In Section V, several examples are discussed to evaluate the effectiveness of the proposed neural network approach. Finally, Section VI gives the conclusion of this paper.
II. PROBLEM FORMULATION AND NEURAL DESIGN

In this section, we describe the convex programming problem with standard linear constraints and its equivalent formulation. Then we present a neural network for solving this problem.

Consider the following convex programming problem with linear constraints:

   minimize  f(x)
   subject to  Ax = b,  x ≥ 0                                  (1)

where f is continuously differentiable and convex from R^n to R, A ∈ R^{m×n}, and b ∈ R^m. When f(x) = (1/2) x^T Qx + c^T x, where Q ∈ R^{n×n} is positive semidefinite and c ∈ R^n, problem (1) becomes a standard convex quadratic programming problem

   minimize  (1/2) x^T Qx + c^T x
   subject to  Ax = b,  x ≥ 0.                                 (2)

Define a Lagrange function of (1) below

   L(x, y) = f(x) − y^T (Ax − b)

where y ∈ R^m is referred to as the Lagrange multiplier. According to the Karush–Kuhn–Tucker (KKT) condition [21], x* is a solution to (1) if and only if there exists y* ∈ R^m such that (x*, y*) satisfies the following condition:

   ∇f(x*) − A^T y* ≥ 0,   (x*)^T (∇f(x*) − A^T y*) = 0,   Ax* = b,   x* ≥ 0.

This can be equivalently written as

   (x − x*)^T (∇f(x*) − A^T y*) ≥ 0,  ∀ x ≥ 0,   Ax* = b.

Using the well-known projection theorem [21], we can easily obtain the following lemma.

Lemma 1: x* is a solution to (1) if and only if there exists y* ∈ R^m such that (x*, y*) satisfies

   x* = (x* − α(∇f(x*) − A^T y*))^+,   Ax* = b

where α is a positive constant and (x)^+ = [(x_1)^+, …, (x_n)^+]^T with (x_i)^+ = max{0, x_i}.

Proof: See [21, Prop. 5.1, p. 267].

Based on the equivalent formulation in Lemma 1, we propose a recurrent neural network for solving (1), with its dynamical equation being given by

   dx/dt = λ {(x − ∇f(x) + A^T y)^+ − x},   dy/dt = λ (−Ax + b)               (3)

where λ > 0 is a scaling constant. The dynamical equation (3) can be easily realized in a recurrent neural network with a single-layer structure as shown in Fig. 1, where (·)^+ can be implemented by using piecewise linear activation functions [7]. According to Fig. 1, the circuit realizing the proposed neural network consists of integrators, processors for ∇f(x), piecewise linear activation functions, connection weights, and summers. Therefore, the complexity of the proposed network depends only on the mapping ∇f in the original problem.

Recently, a two-layer neural network for solving (1) was developed in [12]. Its dynamical equation is described by (4). It is easy to see that the proposed neural network in (3) has a simpler structure than (4), since the neural network in (4) has one additional nonlinear term. Furthermore, a comparison of the model complexity can be made between (3) and (4). For illustrative convenience, we consider the case that the objective function is quadratic, f(x) = (1/2) x^T Qx + c^T x. The corresponding neural network in (4) becomes (5), while the corresponding neural network in (3) is then given by

   dx/dt = λ {((I − Q)x − c + A^T y)^+ − x},   dy/dt = λ (−Ax + b).           (6)

We now estimate their model complexity by using the total number of multiplications/divisions and additions/subtractions performed per iteration [8]. The state equation in (6) requires fewer multiplications and additions/subtractions per iteration than the state equation in (5). Therefore, the present neural network in (3) has a lower model complexity for implementation than the existing network in (4).

III. STABILITY ANALYSIS

In this section, we study the global stability of the proposed neural network for solving problem (1). We first give one definition and one lemma for later discussion.

Definition 1: A continuous-time neural network is said to be globally convergent if, for any given initial point, the trajectory of the dynamic system converges to an equilibrium point.

Lemma 2: Let Ω be a closed convex set of R^n. Then

   (v − P_Ω(v))^T (P_Ω(v) − u) ≥ 0,  ∀ v ∈ R^n,  u ∈ Ω

and ‖P_Ω(u) − P_Ω(v)‖ ≤ ‖u − v‖ for all u, v ∈ R^n, where ‖·‖ denotes the l_2 norm and the projection operator P_Ω is defined by

   P_Ω(u) = arg min_{v ∈ Ω} ‖u − v‖.

Proof: See [21, Prop. 3.2, p. 211].

The proposed neural network has the following basic property.
Fig. 1. Block diagram of the recurrent neural network in (3).
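To make the preceding construction concrete, the following Python sketch simulates the dynamics of (3) by forward-Euler integration on a small quadratic program. It is only an illustration under stated assumptions: the form of (3) used here is the one-layer projection dynamics reconstructed above, and the problem data, step size, iteration count, and helper names (simulate_network3, grad_f) are invented for the example rather than taken from the paper. Note that at an equilibrium of these dynamics, x = (x − ∇f(x) + A^T y)^+ and Ax = b, which is the condition of Lemma 1 with α = 1.

import numpy as np

# Illustrative sketch only: forward-Euler simulation of the one-layer
# projection dynamics reconstructed as (3),
#   dx/dt = lam * ((x - grad_f(x) + A^T y)^+ - x),
#   dy/dt = lam * (b - A x),
# applied to the toy QP  min 0.5 x^T Q x + c^T x  s.t.  A x = b, x >= 0.
def simulate_network3(grad_f, A, b, x0, y0, lam=1.0, dt=1e-3, steps=20000):
    x, y = x0.astype(float), y0.astype(float)
    for _ in range(steps):
        p = np.maximum(x - grad_f(x) + A.T @ y, 0.0)   # piecewise-linear activation (u)^+
        x_new = x + dt * lam * (p - x)
        y_new = y + dt * lam * (b - A @ x)
        x, y = x_new, y_new
    return x, y

# Made-up strictly convex test data (not from the paper).
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x_star, y_star = simulate_network3(lambda x: Q @ x + c, A, b,
                                   x0=np.zeros(2), y0=np.zeros(1))
print("approximate solution:", x_star, "equality residual:", A @ x_star - b)

The discretization above is the simplest possible choice; any standard ODE solver could be substituted, and the step size trades off simulation speed against accuracy.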
Lemma 3: 1) For any initial point u_0 = (x_0, y_0) with x_0 ≥ 0, there exists a unique continuous solution u(t) = (x(t), y(t)) for (3). Moreover, x(t) ≥ 0 for all t ≥ 0. 2) The equilibrium point of (3) solves (1).

Proof: 1) Without loss of generality, we assume λ = 1. Let the initial point be u_0 = (x_0, y_0) with x_0 ≥ 0. Since, by Lemma 2, the projection operator (·)^+ is locally Lipschitz continuous, the right-hand term of (3) is locally Lipschitz continuous as well. According to the local existence and uniqueness theorem of ODEs [24], there exists a unique continuous solution u(t) of (3) on some interval [0, T). We show below that u(t) is bounded and, thus, the local existence of the solution of (3) can be extended to global existence. Next, note that
2) From the equivalent formulation in Lemma 1, it follows that the equilibrium point of (3) solves (1).

We now establish our main result as follows.

Theorem 1: Assume that f(x) is strictly convex and twice differentiable. Then the proposed neural network of (3) with the initial point x_0 ≥ 0 is stable in the sense of Lyapunov and globally convergent to the stationary point u* = (x*, y*), where x* is the optimal solution of (1) and y* is the corresponding Lagrange multiplier.

Proof: Without loss of generality, we assume λ = 1. First, from Lemma 3 it follows that x(t) ≥ 0. We define the following Lyapunov function:
where
Then
since the two terms on the right-hand side above are nonnegative.
and have
. According to [22, Th. 3.2], for any
we
where ∇²f(x) denotes the Jacobian matrix of ∇f(x). Then
where
, the positive–definiteness of
Thus,
implies
. On the other side, note that implies
where
It follows that dV(u)/dt = 0 if and only if du/dt = 0. Therefore, the proposed neural network in (3) is globally convergent to the optimal solution of (1).

Remark 1: Existing global convergence results [9], [11], [13], [14], [19], [25] are based on the condition that f(x) is quadratic or that ∇²f(x) is positive definite. Here, Theorem 1 does not require the condition that ∇²f(x) be positive definite.

Theorem 2: Assume that ∇²f(x) is positive definite for all x ≥ 0. Then the convergence time of the proposed neural network in (3) is finite.

Proof: Consider the Lyapunov function V(u) defined in the proof of Theorem 1. By Theorem 1, we see that the proposed neural network in (3) is globally convergent to an equilibrium point u* of (3), and
By the results given in [23], we know that
and Then
and
It follows that: Moreover, for any point
and
Thus, for any initial point u_0 = (x_0, y_0) with x_0 ≥ 0, the solution trajectory is bounded. By the invariant set theorem [24], we see that all solution trajectories of the proposed neural network in (3) converge to the largest invariant set M = {u | dV(u)/dt = 0}. We now prove that dV(u)/dt = 0 if and only if du/dt = 0. Clearly, if du/dt = 0, then dV(u)/dt = 0. Conversely, let dV(u)/dt = 0. It can be seen that this implies (7), and any point satisfying (7)
must be an equilibrium point of (3). By the given condition, the initial point is not an equilibrium point of (3). Without loss of generality, we assume that dx(0)/dt ≠ 0 or dy(0)/dt ≠ 0. Since du(t)/dt is continuous, there exist ε > 0 and t_1 > 0 such that ‖du(t)/dt‖ ≥ ε on [0, t_1].
It can be seen that u(t) reaches the equilibrium point u* within a finite time.
Furthermore, since ∇²f(x) is positive semidefinite, (7) implies the desired conclusion.

IV. EXTENSIONS

In this section, we give two extensions of the proposed neural network in (3). First, consider the following nonlinear convex programming problem of the form

   minimize  f(x)
   subject to  Ax = b,  x ∈ X                                  (8)
where f, A, and b are defined in Section II, and X = {x ∈ R^n | l ≤ x ≤ u} is a box set with given lower and upper bounds l and u. In this situation, the equivalent formulation in Lemma 1 still holds. That is, x* is a solution to (8) if and only if there exists y* ∈ R^m such that

   x* = P_X(x* − α(∇f(x*) − A^T y*)),   Ax* = b

where P_X : R^n → X is a projection operator defined by

   P_X(x) = arg min_{v ∈ X} ‖x − v‖.

Therefore, as one extension of the proposed neural network in (3), a neural network model for solving (8) can be given by

   dx/dt = λ {P_X(x − ∇f(x) + A^T y) − x},   dy/dt = λ (−Ax + b).             (9)

Similar to the proofs of Theorems 1 and 2, we can get the following stability result for the neural network in (9).

Theorem 3: Assume that f(x) is strictly convex and twice differentiable for any x ∈ X. Then the neural network of (9) is stable in the sense of Lyapunov and globally convergent to the stationary point (x*, y*), where x* is the optimal solution of (8). Moreover, the convergence speed of the proposed neural network in (9) is proportional to the design parameter λ.

It can be seen that when A = 0 and b = 0, where 0 is a zero matrix, (8) becomes a nonlinear optimization problem with box constraints

   minimize  f(x)
   subject to  x ∈ X.                                          (10)

The corresponding neural network for solving (10), which appeared in [13] and [14], is then given by

   dx/dt = λ {P_X(x − ∇f(x)) − x}.                             (11)

As for the stability of the neural network in (11), we have the following result.

Corollary 1: Assume that ∇²f(x) is positive definite on X. Then the neural network of (11) is globally exponentially stable at the optimal solution of (10).

Proof: First, since ∇²f(x) is positive definite on X, ∇f(x) is strictly monotone on X. Let x(t) be the trajectory of (11) with initial point x(0) = x_0 ∈ X, and let the Lyapunov function be the squared distance to an optimal solution x* of (10). Similar to the analysis of Theorem 1, this function is nonincreasing along the trajectory. On one side, since X is bounded, the trajectory remains in a compact level set. On the other side, from the uniform convexity of f and Lemma 1 in [23, p. 478], there exists a positive constant such that the distance to x* decays at an exponential rate. So the neural network in (11) is globally exponentially stable.

Remark 2: Compared with the existing condition [13], [14] that ∇²f(x) is uniformly positive definite, Corollary 1 requires only the condition that ∇²f(x) is positive definite.

Next, consider the following variational inequality problem: find x* ∈ Ω such that

   (x − x*)^T F(x*) ≥ 0,   ∀ x ∈ Ω                             (12)

where F : R^n → R^n is continuously differentiable and Ω = {x ∈ X | Ax = b}. By the KKT condition for (12), we see that x* is a solution of (12) if and only if there exists y* ∈ R^m such that (x*, y*) is a solution of the following problem:

   (x − x*)^T (F(x*) − A^T y*) ≥ 0,  ∀ x ∈ X,   Ax* = b.       (13)

According to the projection theorem [21], we see that (13) can be reformulated as

   x* = P_X(x* − (F(x*) − A^T y*)),   Ax* = b.

Thus, as another extension of the proposed neural network in (3), we have the following neural network model for solving (12):

   dx/dt = λ {P_X(x − F(x) + A^T y) − x},   dy/dt = λ (−Ax + b)               (14)

where λ > 0 is a scaling constant.
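A similarly hedged sketch follows for this extension. It assumes that (9) and (14) share the structure of (3) with the (·)^+ activation replaced by the box projection P_X (componentwise clipping to [l, u]); the mapping F, the bounds, the step size, and the helper names (project_box, step_network) are illustrative choices rather than data from the paper.

import numpy as np

# Sketch of the Section IV extension under the assumption that (9)/(14) read
#   dx/dt = lam * (P_X(x - F(x) + A^T y) - x),   dy/dt = lam * (b - A x),
# where P_X clips componentwise to the box [l, u]. For (9), F = grad f.
def project_box(v, l, u):
    # P_X(v) = argmin over l <= z <= u of ||v - z||, computed componentwise
    return np.minimum(np.maximum(v, l), u)

def step_network(x, y, F, A, b, l, u, lam=1.0, dt=1e-3):
    p = project_box(x - F(x) + A.T @ y, l, u)
    return x + dt * lam * (p - x), y + dt * lam * (b - A @ x)

# Made-up example: an asymmetric mapping with positive-definite Jacobian,
# so (12) is a genuine variational inequality rather than an optimization problem.
M = np.array([[2.0, 0.5], [-0.5, 2.0]])
q = np.array([1.0, -1.0])
F = lambda x: M @ x + q

A = np.array([[1.0, 1.0]]); b = np.array([1.0])
l = np.zeros(2); u = np.ones(2)

x, y = np.full(2, 0.5), np.zeros(1)
for _ in range(20000):
    x, y = step_network(x, y, F, A, b, l, u)
print("approximate VI solution:", x, "equality residual:", A @ x - b)

When A and b are zero, the same step reduces to the dynamics (11) for box-constrained minimization.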
From the analysis of Theorem 1, we easily get the following global convergence result on the neural network in (14).

Theorem 4: Assume that ∇F(x) is positive definite on X. Then the trajectory of the neural network in (14) with initial point x_0 ∈ X converges globally to an equilibrium point (x*, y*), where x* is a solution of (12). Moreover, the convergence time of the neural network in (14) is finite.

V. NUMERICAL EXAMPLES

First, we give one numerical example to demonstrate the effectiveness of the obtained results.

Example 1: Consider the convex programming problem (8) in the form of (15). This problem has only one solution x*. Note that ∇²f(x) is positive definite on X, so Theorem 3 guarantees that the neural network in (9) is globally convergent to (x*, y*). We use the neural network in (9) to solve this problem. All simulation results show that the neural network in (9) is globally convergent to (x*, y*). For example, Fig. 2 displays the trajectories of (9) with ten random initial points.

Fig. 2. Transient behavior of u(t) = (x(t), y(t)) based on the proposed continuous-time neural network of (9) in Example 1.

Example 2: Consider the variational inequality problem (12). This problem has a unique solution x* [11]. We use the neural network in (14) to solve this problem. All simulation results show that the neural network in (14) is always globally convergent. For example, with a zero initial point, the obtained solution agrees with x*. Fig. 3 displays the transient behavior of the neural network in (14) with ten random initial points.

Fig. 3. Transient behavior of the proposed neural network in (14) in Example 2.

The proposed neural network in (3) has many real-time applications. For example, it can be applied to support vector machines for classification and regression [15], [16] and to robot motion control in real time [17], [18]. Here, we apply the proposed neural network in (3) to image fusion. The purpose of image fusion is to increase the useful information content of images by using a fusion technique so that more discriminating features can be extracted for further processing. As a result, image fusion techniques have been applied to many fields such as remote sensing, medical imaging, and machine vision. Consider a two-dimensional (2-D) image whose gray-level amplitude is recorded pixel by pixel. Then the imagery information can be rewritten as a set of data along time. Assume that the images are collected by n different image sensors, and each image is partly distorted by noise. According to the result discussed in [20], the considered image fusion problem can be formulated as a deterministic quadratic programming problem (16), where V is a sample variance matrix and x is called the fused vector. An optimal fusion solution is then given by (17).
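A hedged numerical sketch of this fusion scheme follows. It assumes the fusion weights solve a quadratic program of the form min x^T V x subject to e^T x = 1, x ≥ 0, which is an assumed reading of (16) in the standard form that (3) handles directly, with V the sample variance matrix of the sensor images; the synthetic data, noise levels, and variable names are illustrative inventions, not values from the paper.

import numpy as np

# Illustrative fusion-weight computation via the one-layer dynamics (3),
# assuming the fusion QP has the form  min x^T V x  s.t.  e^T x = 1, x >= 0,
# with V the sample variance matrix of the sensor images. Synthetic data only.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=64 * 64)            # stand-in for a scanned 2-D image
sensors = np.stack([clean + rng.normal(0.0, s, clean.shape)
                    for s in (0.05, 0.10, 0.20)])       # three sensors, different noise levels

V = np.cov(sensors)                                     # 3 x 3 sample variance matrix
e = np.ones(3)
grad = lambda x: 2.0 * V @ x                            # gradient of x^T V x

x, y = np.full(3, 1.0 / 3.0), 0.0                       # here A = e^T and b = 1
dt, lam = 1e-2, 1.0
for _ in range(50000):
    p = np.maximum(x - grad(x) + e * y, 0.0)
    x, y = x + dt * lam * (p - x), y + dt * lam * (1.0 - e @ x)

fused = sensors.T @ x                                   # weighted combination of the sensor images
print("fusion weights:", x, "(sum = %.4f)" % x.sum())
print("per-sensor MSE:", ((sensors - clean) ** 2).mean(axis=1))
print("fused MSE:", ((fused - clean) ** 2).mean())

The weights favor the less noisy sensors, so the fused image in this sketch has a lower mean squared error than the individual noisy copies.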
We now apply the proposed neural network in (3) to find the optimal solution of (17). The corresponding neural network for solving (17) becomes (18), where λ > 0 is a scaling constant. The following convergence result is a direct consequence of Theorem 1.

Corollary 2: Assume that the sample variance matrix V is positive definite. Then the neural network in (18) is globally convergent to an optimal fused solution.

Example 3: Consider image fusion using the proposed neural network. We test the global convergence of the neural network in (18) on the image Camera. The test images are 8-bit grey-level images with 256 by 256 pixels. We first assume that the images of Camera are collected by three different image sensors, where the signal-to-noise ratio (SNR) is 10 dB. All simulation results show that the proposed neural network in (18) is globally convergent to an optimal fusion solution. Fig. 4 displays the fused images obtained using the proposed neural network in (18). It is easy to see from this figure that the fused image has a much clearer appearance than the images collected by the local sensors.

Fig. 4. Camera image fusion using the proposed neural network in Example 3. (a) Noise image. (b)–(d) Fused image with n = 10, 20, 30.

VI. CONCLUDING REMARKS

We have proposed a recurrent neural network for solving nonlinear convex programming problems with general linear constraints. It is shown here that the proposed neural network is stable in the sense of Lyapunov and globally convergent to an optimal solution under a strict convexity condition on the objective function. Compared with the existing neural networks for solving such problems, the proposed neural network has a simple single-layer structure and is amenable to parallel implementation. Moreover, the proposed neural network does not require the Lipschitz continuity condition on the objective function. Finally, examples and applications are provided to show the performance of the proposed neural network.
REFERENCES

[1] N. Kalouptisidis, Signal Processing Systems, Theory and Design. New York: Wiley, 1997.
[2] H. Li, B. S. Manjunath, and S. K. Mitra, “Multisensor image fusion using the wavelet transform,” Graph. Models Image Process., vol. 57, no. 3, pp. 235–245, 1995.
[3] M. P. Kennedy and L. O. Chua, “Neural networks for nonlinear programming,” IEEE Trans. Circuits Syst., vol. CAS-35, no. 5, pp. 554–562, May 1988.
[4] A. Rodríguez-Vázquez, R. Domínguez-Castro, A. Rueda, J. L. Huertas, and E. Sánchez-Sinencio, “Nonlinear switched-capacitor neural networks for optimization problems,” IEEE Trans. Circuits Syst., vol. 37, no. 3, pp. 384–397, Mar. 1990.
[5] C. Y. Maa and M. A. Shanblatt, “Linear and quadratic programming neural network analysis,” IEEE Trans. Neural Netw., vol. 3, no. 4, pp. 580–594, Jul. 1992.
[6] A. Bouzerdoum and T. R. Pattison, “Neural network for quadratic optimization with bound constraints,” IEEE Trans. Neural Netw., vol. 4, no. 2, pp. 293–304, Mar. 1993.
[7] A. Cichocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing. New York: Wiley, 1993.
[8] S. H. Zak, V. Upatising, and S. Hui, “Solving linear programming problems with neural networks: A comparative study,” IEEE Trans. Neural Netw., vol. 6, no. 1, pp. 94–104, Jan. 1995.
[9] Y. S. Xia, “A new neural network for solving linear and quadratic programming problems,” IEEE Trans. Neural Netw., vol. 7, no. 6, pp. 1544–1547, Nov. 1996.
[10] Y. S. Xia and J. Wang, “A general methodology for designing globally convergent optimization neural networks,” IEEE Trans. Neural Netw., vol. 9, no. 6, pp. 1331–1343, Nov. 1998.
[11] Y. S. Xia, H. Leung, and J. Wang, “A projection neural network and its application to constrained optimization problems,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 49, no. 4, pp. 447–458, Apr. 2002.
[12] Q. Tao, J. D. Cao, M. S. Xue, and H. Qiao, “A high performance neural network for solving nonlinear programming problems with hybrid constraints,” Phys. Lett. A, vol. 288, no. 2, pp. 88–94, 2001.
[13] X. B. Liang and J. Wang, “A recurrent neural network for nonlinear optimization with a continuously differentiable objective function and bound constraints,” IEEE Trans. Neural Netw., vol. 11, no. 6, pp. 1251–1262, Nov. 2000.
[14] Y. S. Xia and J. Wang, “On the stability of globally projected dynamic systems,” J. Optim. Theory Appl., vol. 106, no. 1, pp. 129–150, Jul. 2000.
[15] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, pp. 273–297, 1995.
[16] Y. S. Xia and J. Wang, “A one-layer recurrent neural network for support vector machine learning,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 34, no. 2, pp. 1261–1269, Apr. 2004.
[17] Y. Zhang, J. Wang, and Y. S. Xia, “A dual neural network for redundancy resolution of kinematically redundant manipulators subject to joint limits and joint velocity limits,” IEEE Trans. Neural Netw., vol. 14, no. 3, pp. 658–667, May 2003.
[18] J. Wang, Q. Hu, and D. Jiang, “A Lagrangian neural network for kinematic control of redundant robot manipulators,” IEEE Trans. Neural Netw., vol. 10, no. 5, pp. 1123–1132, Sep. 1999.
[19] Y. S. Xia and J. Wang, “A general projection neural network for solving monotone variational inequality and related optimization problems,” IEEE Trans. Neural Netw., vol. 15, no. 2, pp. 318–328, Mar. 2004.
[20] Y. S. Xia, H. Leung, and E. Bossé, “Neural data fusion algorithms based on a linearly constrained least square method,” IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 320–329, Mar. 2002.
[21] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[22] M. Fukushima, “Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems,” Math. Program., vol. 53, pp. 99–110, 1992.
[23] J. S. Pang, “A posteriori error bounds for the linearly-constrained variational inequality problem,” Math. Oper. Res., vol. 12, pp. 474–484, 1987.
[24] R. M. Golden, Mathematical Methods for Neural Network Analysis and Design. Cambridge, MA: MIT Press, 1996.
[25] X. Gao, L. Z. Liao, and W. Xue, “A neural network for a class of convex quadratic minimax problems with constraints,” IEEE Trans. Neural Netw., vol. 15, no. 3, pp. 622–628, May 2004.
Youshen Xia (M’96–SM’01) received the B.S. and M.S. degrees, both in computational mathematics and applied software, from Nanjing University, Nanjing, China, in 1982 and 1989, respectively, and the Ph.D. degree from the Department of Automation and Computer-Aided Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, in 2000.
His present research interests include the design and analysis of recurrent neural networks for constrained optimization and neural network applications to data mining, data fusion, and signal and image processing.
Jun Wang (S’89–M’90–SM’93) received the B.S. degree in electrical engineering and the M.S. degree in systems engineering from Dalian University of Technology, Dalian, China, and the Ph.D. degree in systems engineering from Case Western Reserve University, Cleveland, OH. He is currently a Professor of automation and computer-aided engineering at the Chinese University of Hong Kong, Shatin. Prior to coming to Hong Kong in 1995, he was an Associate Professor at the University of North Dakota, Grand Forks. His current research interests include neural networks and their engineering applications. Dr. Wang is an Associate Editor of the IEEE TRANSACTIONS ON NEURAL NETWORKS and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: PARTS B AND C.