Primal and Dual Assignment Networks


IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 8, NO. 3, MAY 1997

Jun Wang

Abstract—This paper presents two recurrent neural networks for solving the assignment problem. Simplifying the architecture of a recurrent neural network based on the primal assignment problem, the first recurrent neural network, called the primal assignment network, has less complex connectivity than its predecessor. The second recurrent neural network, called the dual assignment network and based on the dual assignment problem, is even simpler in architecture than the primal assignment network. The primal and dual assignment networks are guaranteed to make optimal assignments. The applications of the primal and dual assignment networks to sorting and shortest-path routing are discussed. The performance and operating characteristics of the dual assignment network are demonstrated by means of illustrative examples.

Index Terms—Assignment problem, optimization, recurrent neural networks.

I. INTRODUCTION

THE assignment problem (also known as the linear assignment problem and the matching problem) is concerned with assigning a number of entities to a number of positions while minimizing a linear cost function. The assignment problem is a classical combinatorial optimization problem arising in numerous planning and design contexts. The applications of the assignment problem include, but are not limited to, pattern classification, job assignment, facility layout, production scheduling, and printed-circuit-board design. Various solution procedures for the assignment problem have been investigated over the decades. Besides the classical methods such as the simplex method and the Hungarian method, many new and improved methods have been developed [1], [2]. For time-varying and/or large-scale assignment problems such as weapon–target assignment, the existing algorithms may not be effective and efficient, due to the time-varying problem parameters and stringent processing requirements, and real-time solution methods are more desirable. Since Hopfield and Tank's seminal work [3], neural networks for solving optimization problems have been a major area of neural-network research, e.g., [4]–[10]. In particular, various neural networks have been developed for solving the assignment problem [11]–[17]. For example, a recurrent neural network called the deterministic annealing network [9], realized in an analog circuit [13], is shown to be capable of making optimal assignments in real time. In the present paper, two recurrent neural networks, called respectively the primal and dual assignment networks (PAN and DAN), for solving the assignment problem are presented.

Compared with their predecessor, the present recurrent neural networks (especially DAN) are less complex in architecture. Compared with other neural-network approaches to the assignment problem, PAN and DAN are guaranteed to make optimal assignments. The rest of this paper is organized as follows. In Section II, the formulations of the primal and dual assignment problems are introduced. In Sections III and IV, the dynamical equations, architectures, and design principles of PAN and DAN are discussed, respectively. In Sections V and VI, the use of the primal and dual networks for sorting and shortest-path routing is described, respectively, and the functional capabilities and operating characteristics of DAN are demonstrated via two illustrative examples. Finally, Section VII concludes the paper with final remarks.

Manuscript received April 29, 1996; revised November 25, 1996. The author is with the Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Publisher Item Identifier S 1045-9227(97)02761-6.

II. PROBLEM FORMULATION

A. Primal Assignment Problem

The assignment problem can be formulated as the following zero–one integer linear program:

minimize
    $\sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}$   (1)

subject to
    $\sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, \ldots, n$   (2)
    $\sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, \ldots, n$   (3)
    $x_{ij} \in \{0, 1\}, \quad i, j = 1, 2, \ldots, n$   (4)

where $c_{ij}$ and $x_{ij}$ are, respectively, the cost coefficient and decision variable associated with assigning entity $i$ to position $j$. In general, a cost coefficient can be positive, representing a loss, or negative, representing a gain. The decision variable is defined such that $x_{ij} = 1$ if and only if entity $i$ is assigned to position $j$. The objective function (1) to be minimized is the total cost of the assignment. Constraint (2) ensures that exactly one entity is assigned to each position; i.e., each column of $[x_{ij}]$ has only one decision variable equal to one. Constraint (3) ensures that each entity is assigned to exactly one position; i.e., each row of $[x_{ij}]$ has only one decision variable equal to one. Constraint (4) is the zero–one integrality constraint on the decision variables. It is well known that, from the optimal-solution point of view, if the optimal solution is unique then the assignment problem is equivalent to a linear programming problem obtained by replacing the zero–one integrality constraints (4) with the nonnegativity constraints, due to the total unimodularity property [1], [2]:

    $x_{ij} \geq 0, \quad i, j = 1, 2, \ldots, n.$   (5)
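On small instances, the program (1)–(4) can be checked directly: the feasible set of (2)–(4) is exactly the set of $n \times n$ permutation matrices, so enumerating permutations suffices. A minimal sketch (the cost matrix below is a hypothetical example, not from the paper):

```python
from itertools import permutations

def solve_assignment(c):
    """Brute-force solution of (1)-(4): each permutation p encodes a
    feasible x with x[i][j] = 1 iff p[i] == j (entity i -> position p[i]),
    which satisfies the row/column constraints (2) and (3) by construction."""
    n = len(c)
    best_cost, best_perm = float("inf"), None
    for p in permutations(range(n)):
        cost = sum(c[i][p[i]] for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, p
    return best_cost, best_perm

# A small instance with hypothetical cost coefficients.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
cost, perm = solve_assignment(c)  # cost 5; entities 0, 1, 2 -> positions 1, 0, 2
```

Because of the total unimodularity noted above, the linear programming relaxation (1)–(3), (5) attains the same optimum whenever that optimum is unique, which is the property PAN exploits.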


The resulting linear program is called the primal assignment problem hereafter. It contains $n^2$ decision variables, $2n$ equality constraints, and $n^2$ nonnegativity constraints.

B. Dual Assignment Problem

Since the number of equality constraints ($2n$) is less than the number of decision variables ($n^2$) for $n > 2$, it is more desirable to formulate the dual of the primal assignment problem. Based on the primal assignment problem, the dual assignment problem can be formulated as follows:

maximize
    $\sum_{i=1}^{n} u_i + \sum_{j=1}^{n} v_j$   (6)

subject to
    $u_i + v_j \leq c_{ij}, \quad i, j = 1, 2, \ldots, n$   (7)

where $u_i$ and $v_j$ denote the dual decision variables. The numbers of decision variables and inequality constraints in the dual assignment problem are $2n$ and $n^2$, respectively. According to the duality theorem in optimization theory [1], [2], the value of the objective function (6) at its maximum is equal to the total cost (1) of the primal assignment problem at its minimum; i.e., there is no duality gap. Because a solution to the dual assignment problem does not directly show an assignment, decoding from the dual optimal solution to the primal optimal solution is needed when the assignment problem is solved in terms of the dual decision variables. According to the complementary slackness theorem [2], given feasible solutions $\{x_{ij}\}$ and $(u_i, v_j)$ to the primal and dual assignment problems, respectively, the solutions are optimal if and only if, for $i, j = 1, 2, \ldots, n$, the following holds: 1) $x_{ij} > 0$ implies $u_i + v_j = c_{ij}$; 2) $x_{ij} = 0$ is implied by $u_i + v_j < c_{ij}$. The complementary slackness theorem can be used to decode the optimal solution to the primal assignment problem from that to the dual assignment problem.

Because the assignment problem is formulated as a linear program, it can be solved by the neural networks proposed for solving linear programming problems in [3]–[10]. In [8] and [9], a recurrent neural network, called the deterministic annealing network, is presented and demonstrated to be capable of solving linear programming problems. PAN and DAN, discussed in the ensuing sections, are tailored from the deterministic annealing network [8], [9].

III. PRIMAL ASSIGNMENT NETWORK

A. Energy Function

To simplify the notation, let the symbols for the decision variables of the primal and dual assignment problems also denote hereafter the activation states of PAN and DAN. An energy function based on the primal assignment problem can be defined as follows:

    $E(x, t) = \frac{\beta}{2} \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{n} x_{ij} - 1\Bigr)^2 + \frac{\beta}{2} \sum_{i=1}^{n}\Bigl(\sum_{j=1}^{n} x_{ij} - 1\Bigr)^2 + \eta e^{-t/\tau} \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij} x_{ij}$   (8)

where $\beta$, $\eta$, and $\tau$ are positive scaling constants. The decaying term in (8), $\eta e^{-t/\tau}$, is called the temperature parameter. Its role is to balance the effects of cost minimization and constraint satisfaction, as explained at length in [8] and [9].

B. Dynamical Equation

Let $x_{ij} = f(y_{ij})$; the state equation and the activation function of the recurrent neural network presented in [13] are described, respectively, as follows: for $i, j = 1, 2, \ldots, n$,

    $\frac{dy_{ij}}{dt} = -\mu\Bigl[\beta\Bigl(\sum_{k=1}^{n} x_{kj} + \sum_{l=1}^{n} x_{il} - 2\Bigr) + \eta e^{-t/\tau} c_{ij}\Bigr]$   (9)
    $x_{ij}(t) = f(y_{ij}(t))$   (10)

where $y_{ij}$ denotes the net input to neuron $(i, j)$, $\mu$ is a positive sensitivity parameter, and $f(\cdot)$ is a nonnegative and monotone nondecreasing activation function; i.e., $f(\xi) \geq 0$ and $df(\xi)/d\xi \geq 0$. To reduce the connectivity of the resulting network architecture, the dynamical equation (9) can be rewritten by using the instrumental variables $v_j$ and $w_i$ defined in (12) and (13) herein: for $i, j = 1, 2, \ldots, n$,

    $\frac{dy_{ij}}{dt} = -\mu\bigl[\beta v_j + \beta w_i + \eta e^{-t/\tau} c_{ij}\bigr]$   (11)
    $v_j = \sum_{i=1}^{n} x_{ij} - 1$   (12)
    $w_i = \sum_{j=1}^{n} x_{ij} - 1$   (13)

The multilayer recurrent neural network for solving the primal assignment problem described in (10)–(13) is called PAN hereafter.

C. Network Architecture

PAN consists of $n^2 + 2n$ neurons arranged spatially in three layers: an output layer and two hidden layers. The output layer consists of an $n \times n$ two-dimensional array of output neurons represented by $x_{ij}$. Each of the two hidden layers consists of an $n$-vector of hidden neurons representing the instrumental variables $v_j$ or $w_i$. The first two terms on the right-hand side of (11) define the connectivity from the hidden neurons to the output neurons. The third term on the right-hand side of (11) defines a decaying external input to the output layer. Similarly, the first term on the right-hand side of (12) and (13) defines the connectivity from the output neurons to the


hidden neurons. The second term on the right-hand side of (12) and (13) defines a constant negative unity input (bias) to the hidden layers. Moreover, (11)–(13) also show the connectivity of PAN. 1) There exist inhibitory connections with weight $-\mu\beta$ from $v_j$ and $w_i$ in the hidden layers to $x_{ij}$ in the output layer. 2) There exist excitatory connections with unity weight from $x_{ij}$ in the output layer to $v_j$ in the hidden layer. 3) There exist excitatory connections with unity weight from $x_{ij}$ in the output layer to $w_i$ in the hidden layer. 4) There is no lateral connection among neurons in either the output layer or the hidden layers. The spatial complexity of PAN is thus characterized by $n^2 + 2n$ neurons and $4n^2$ connections. Specifically, there are $4n^2$ unidirectional connections in PAN, compared with the approximately $2n^3$ lateral connections in the single-layer recurrent neural network [13]; i.e., a reduction of connections for $n > 2$. Fig. 1 illustrates the architecture of PAN.

D. Activation Function

The role of the activation function is twofold: to enforce the nonnegativity constraint on $x_{ij}$ as described in (5), and to scale the sensitivity of the activation of $x_{ij}$. The basic requirements for the activation function of PAN are nonnegativity and nondecreasing monotonicity; i.e., $f(\xi) \geq 0$ and $df(\xi)/d\xi \geq 0$ for all $\xi$. The unipolar sigmoid function is a good candidate for the activation function, with its supremum equal to that of $x_{ij}$ and a positive scaling constant $\zeta$ determining its steepness. Another simple activation function (based on the Heaviside function), used in [13], can be defined as $f(\xi) = \zeta\xi$ for $\xi > 0$ and $f(\xi) = 0$ otherwise, where $\zeta$ is the slope of the activation function in the positive half-space. As with the sigmoid activation function, the design parameter $\zeta$ determines the activation sensitivity.

E. Design Parameters

If the activation states are cascaded to form a vector, then the connection weight matrix of PAN is essentially an $n^2 \times n^2$ sparse matrix whose nonzero elements all have weight $-\mu\beta$. As discussed in [8] and [9], the convergence rate of the deterministic annealing network increases as the minimum absolute value of the nonzero eigenvalues of the connection weight matrix increases. It has been shown that the nonzero eigenvalues of the connection weight matrix of PAN are proportional to either $n$ or $2n$ [18]. Therefore, a large value of $\mu\beta$ should be used in a design process to expedite the convergence. Furthermore, since the nonzero eigenvalues are directly proportional to $n$, the average convergence time of PAN with given design parameters decreases as $n$ increases. This feature has been demonstrated for the predecessor of PAN in [18]. From (11), it is easy to see that the convergence rate of PAN also depends on the parameter $\tau$. The parameter $\tau$ serves as a time constant for the decaying bias to reinforce the effect of cost minimization. The time constant has to be sufficiently large to sustain cost minimization and ensure solution optimality. Since the transient time of PAN depends on the reciprocals of the sensitivity parameter $\mu$ and the minimum absolute value of the nonzero eigenvalues of the connection weight matrix, a lower bound on the value of $\tau$ follows; specific design rules for the unipolar sigmoid and the Heaviside activation functions are derived in [8]. The role of $\eta$ is to balance the effects of constraint satisfaction and cost minimization, and a corresponding design rule for selecting $\eta$ also follows from the analysis in [8].

IV. DUAL ASSIGNMENT NETWORK

A. Energy Function

Similarly to that in PAN, an energy function based on the dual assignment problem can be formulated as

    $E(u, v, t) = \beta \sum_{i=1}^{n}\sum_{j=1}^{n} \int_{0}^{u_i + v_j - c_{ij}} g(\xi)\, d\xi - \eta e^{-t/\tau}\Bigl(\sum_{i=1}^{n} u_i + \sum_{j=1}^{n} v_j\Bigr)$   (14)

where $g(\cdot)$ is a nonnegative and nondecreasing activation function defined such that $g(\xi) = 0$ if $\xi \leq 0$ and $g(\xi) > 0$ if $\xi > 0$. Similar to that in PAN, the term $\eta e^{-t/\tau}$ is a temperature parameter.

B. Dynamical Equations

Let $u = (u_1, \ldots, u_n)^T$ and $v = (v_1, \ldots, v_n)^T$; the dynamical equations of DAN are as follows: for $i, j = 1, 2, \ldots, n$,

    $\frac{du_i}{dt} = -\mu\Bigl[\beta\sum_{j=1}^{n} g(u_i + v_j - c_{ij}) - \eta e^{-t/\tau}\Bigr]$   (15)

    $\frac{dv_j}{dt} = -\mu\Bigl[\beta\sum_{i=1}^{n} g(u_i + v_j - c_{ij}) - \eta e^{-t/\tau}\Bigr]$   (16)

The solution from DAN can be easily decoded into that from PAN by using the complementary slackness theorem as follows:

    $x_{ij} = \sigma(u_i + v_j - c_{ij})$   (17)

where $\sigma(\cdot)$ is the output function defined as $\sigma(\xi) = 1$ if $\xi \geq 0$, or $\sigma(\xi) = 0$ otherwise.
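The dynamics (15) and (16) and the decoding rule (17) can be simulated directly by Euler integration. The sketch below assumes a rectified activation $g(\xi) = \max(\xi, 0)$ with unit slope and illustrative parameter values (the constants mu, beta, eta, tau and the cost matrix are hypothetical choices, not values from the paper); it checks that the duals settle to a (near-)feasible point of (7):

```python
import math

def simulate_dan(c, mu=1.0, beta=1.0, eta=1.0, tau=1.0, dt=0.01, steps=20000):
    """Euler integration of the DAN dynamics (15)-(16), with the
    rectified activation g(xi) = max(xi, 0) (unit slope)."""
    n = len(c)
    u = [0.0] * n
    v = [0.0] * n
    t = 0.0
    for _ in range(steps):
        temp = eta * math.exp(-t / tau)   # decaying temperature eta*e^(-t/tau)
        du = [-mu * (beta * sum(max(u[i] + v[j] - c[i][j], 0.0)
                                for j in range(n)) - temp) for i in range(n)]
        dv = [-mu * (beta * sum(max(u[i] + v[j] - c[i][j], 0.0)
                                for i in range(n)) - temp) for j in range(n)]
        u = [u[i] + dt * du[i] for i in range(n)]
        v = [v[j] + dt * dv[j] for j in range(n)]
        t += dt
    return u, v

def decode(u, v, c, tol=1e-6):
    """Output function (17): x_ij = 1 iff u_i + v_j - c_ij >= 0 (within tol)."""
    n = len(c)
    return [[1 if u[i] + v[j] - c[i][j] >= -tol else 0 for j in range(n)]
            for i in range(n)]

# Hypothetical 3x3 cost matrix; its optimal primal assignment cost is 5.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
u, v = simulate_dan(c)
# At steady state the duals should be (near-)feasible for (7).
violation = max(u[i] + v[j] - c[i][j] for i in range(3) for j in range(3))
x = decode(u, v, c, tol=0.05)
```

With a sufficiently slow annealing schedule (large tau), the steady-state duals approach the dual optimum and the decoded matrix recovers the optimal assignment via (17); with fast cooling, the decoded matrix may remain conservative.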

C. Network Architectures

DAN consists of $2n$ neurons representing $u_i$ and $v_j$ and arranged spatially in two layers. The dynamical equations (15) and (16) of DAN show that there is an inhibitory connection with weight $-\mu\beta$ between every pair of $u_i$ and $v_j$. That is, the number of connections is $2n^2$: $n^2$ from the $u_i$ to the $v_j$ and another $n^2$ from the $v_j$ to the $u_i$. The dynamical equations also show that all the neurons share the common decaying external input $\mu\eta e^{-t/\tau}$. Fig. 2 illustrates the architecture of DAN.

Fig. 1. Architecture of PAN.

Fig. 2. Architecture of DAN.

D. Design Parameters

Because the constraint coefficient matrix in a dual problem is the transpose of that in the primal problem, the nonzero eigenvalues of the connection weight matrix in DAN are also proportional to either $n$ or $2n$. A large value of $\mu\beta$ can expedite the convergence of DAN as well. Similar to its role in PAN, the role of $\eta$ is to balance the effects of constraint satisfaction and objective maximization. Similar to that in PAN, the role of the activation function in DAN is to enforce the inequality constraints (7) and to scale the activation sensitivity. The Heaviside activation function discussed in the preceding section can be used in DAN.

V. SORTING APPLICATION

Sorting is a process of arranging items in order; it is used widely in numerous application areas such as database management, network communication, digital signal processing, and very large scale integration (VLSI) design. As a fundamental operation in data processing, sorting operations account for over 25% of processing time. In [18], the sorting problem is formulated as a primal assignment problem and solved using a single-layer deterministic annealing network (the predecessor of PAN). The decision variable is defined such that $x_{ij} = 1$ if the item with numerical key $s_i$ is in the $j$th position of the sorted list. The cost coefficients of the assignment problem for sorting are defined in terms of $s_i$ and $w_j$, which denote, respectively, the numerical key of the $i$th item to be sorted and the nonzero weighting parameter for the $j$th position in the desired sequence (sorted list). Example 1: Consider the sorting problem used in [18] (Example 1): rank a set of ten items in ascending order. Let the position weights $w_j$, $j = 1, 2, \ldots, 10$, be chosen as in [18]. Accordingly, the cost coefficient matrix can be defined as in the matrix shown at the bottom of the page. Let the Heaviside activation function be used with appropriate scaling constants. The result of the numerical simulation shows that DAN reaches a steady state. Fig. 3 depicts the transient behavior of the activation states of the simulated DAN; it shows that the simulated DAN takes about 0.4 s to converge. Using (17), the optimal solution to the corresponding primal assignment problem can be easily

interpreted as follows:

TABLE I. COORDINATES OF THE NODES OF THE SHORTEST-PATH PROBLEM IN EXAMPLE 2.

The order representation can accordingly be decoded as a sorted sequence. Obviously, the solution from DAN in this sorting example represents a correct order.

VI. ROUTING APPLICATION

Consider a directed graph $G = (V, E)$, where $V$ is a set of nodes (vertices) and $E$ is an ordered set of arcs (edges). A fixed distance parameter $d_{ij}$ is associated with each arc from node $i$ to node $j$ in the network $G$. The shortest-path routing problem is concerned with finding the shortest path from a specified origin to a specified destination in a given network; i.e., the path minimizing the total cost associated with traversing it. The shortest-path routing problem is a classical combinatorial optimization problem having widespread applications in a variety of settings. Three examples of shortest-path routing applications are vehicle routing in transportation systems, traffic routing in telecommunication networks, and motion planning in robotic systems. Recently, several recurrent neural networks have been developed for shortest-path routing, e.g., [19]–[21]. In [22], the relationship between the assignment problem and the shortest-path problem is explicitly described: an $n$-node shortest-path problem without any negative cycle can be considered as an $(n-1)$-order primal assignment problem with $(n-1)^2$ decision variables. A negative cycle is characterized by a negative sum of cost coefficients around a closed circuit in $G$. The cost coefficient matrix of the assignment problem for shortest-path routing is defined from the arc distances, with zero cost on the main diagonal. The optimal solution to the corresponding primal assignment problem, with the main-diagonal elements reset to remove degenerate cycles, is the optimal solution to the shortest-path problem. Similarly, the shortest-path problem can also be formulated as an $(n-1)$-order dual assignment problem. The necessary condition for the state variables of the assignment networks to converge to a feasible solution is the absence of negative cycles in $G$ [22].
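The reduction from [22] described above can be sketched by brute force on a toy graph (the graph, node numbering, and helper names below are illustrative assumptions, not from the paper): rows range over all nodes except the destination, columns over all nodes except the origin, diagonal entries cost zero, and the optimal assignment traces the shortest path as a chain of successors:

```python
from itertools import permutations

INF = float("inf")

def shortest_path_via_assignment(n, dist, origin, dest):
    """Hoffman-Markowitz reduction: a shortest path from `origin` to
    `dest` in an n-node graph corresponds to an (n-1)-order assignment
    problem whose rows are V minus {dest} and columns are V minus {origin};
    a zero-cost diagonal lets every off-path node be assigned to itself."""
    rows = [i for i in range(1, n + 1) if i != dest]
    cols = [j for j in range(1, n + 1) if j != origin]

    def cost(i, j):
        if i == j:
            return 0.0                 # node i stays off the path
        return dist.get((i, j), INF)   # arc length, INF if no arc

    best, best_map = INF, None
    for p in permutations(range(len(cols))):
        total = sum(cost(rows[k], cols[p[k]]) for k in range(len(rows)))
        if total < best:
            best = total
            best_map = {rows[k]: cols[p[k]] for k in range(len(rows))}

    # Read the path off the assignment by following successors from the origin.
    path, node = [origin], origin
    while node != dest:
        node = best_map[node]
        path.append(node)
    return best, path

# Toy graph: arcs (i, j) -> length; the shortest 1 -> 4 route is 1 -> 2 -> 4.
dist = {(1, 2): 1.0, (2, 4): 1.0, (1, 3): 4.0, (3, 4): 1.0, (1, 4): 3.0}
cost, path = shortest_path_via_assignment(4, dist, origin=1, dest=4)
```

Because all arc lengths are positive, any off-path cycle costs more than the zero-cost self-assignments, so the optimal assignment consists of exactly the path arcs plus self-assignments, as the reduction requires.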

Fig. 3. Transient states of the dual assignment network in Example 1.
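The sorting construction of Example 1 can be sketched as follows. Since the exact cost definition from [18] is not reproduced above, the sketch assumes $c_{ij} = -s_i w_j$ with position weights $w_j = j$, which by the rearrangement inequality makes the minimum-cost assignment place distinct keys in ascending order:

```python
from itertools import permutations

def sort_via_assignment(keys):
    """Sort by solving an assignment problem: assign item i (key s_i) to
    position j with assumed cost c_ij = -s_i * w_j and weights w_j = j.
    By the rearrangement inequality, the minimum-cost assignment places
    the keys in ascending order (keys assumed distinct)."""
    n = len(keys)
    best_cost, best_perm = float("inf"), None
    for p in permutations(range(n)):   # p[i] = position assigned to item i
        cost = sum(-keys[i] * (p[i] + 1) for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, p
    ordered = [None] * n
    for i, pos in enumerate(best_perm):
        ordered[pos] = keys[i]
    return ordered

result = sort_via_assignment([0.6, 0.1, 0.5, 0.2, 0.4])  # five hypothetical keys
```

Any strictly increasing sequence of nonzero weights $w_j$ yields the same sorted order; the choice $w_j = j$ is only the simplest instance.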

Example 2: Consider the same shortest-path problem with ten nodes (Example 1 in [21]), where the origin and destination nodes are, respectively, nodes 1 and 10. The coordinates of the nodes are listed in Table I. The Euclidean distances along the existing arcs are used as the cost coefficients. The total cost of the shortest path of this problem is 1.149 896. The second shortest path has


a very close total cost of 1.172 729. Note that $n = 10$ in this example, since the number of nodes in the given network is ten. Let the Heaviside activation function be used; the numerical simulation result shows that DAN reaches a steady state, and the resulting maximum value of the objective function of the dual assignment problem is 0.149 897. Fig. 4 illustrates the transient behavior of the activation states of the simulated DAN. The simulated DAN also takes about 0.4 s to converge. The optimal solution to the primal shortest-path problem can be readily decoded from the optimal dual solution by ignoring the elements on the main diagonal, the first column, and the last row. Obviously, the decoded solution from DAN represents the shortest path. The simulation result in this example also indicates that DAN can distinguish the shortest path from the very close second shortest path.

Fig. 4. Transient states of the dual assignment network in Example 2.

VII. CONCLUDING REMARKS

In this paper, PAN and DAN have been proposed and shown to be capable of making optimal assignments in real time. Compared with their predecessor, which consists of $n^2$ neurons and approximately $2n^3$ connections, PAN is composed of $n^2 + 2n$ neurons and $4n^2$ connections, whereas DAN is composed of only $2n$ neurons and $2n^2$ connections; both are thus more routable in hardware realization. One salient attribute of PAN and DAN is the independence of the connection weights from the specific problem data: only the external inputs (cost coefficients) differ from one assignment problem to another. Furthermore, PAN and DAN can be modularized with a large number of neurons. In a specific application, the unused neurons can be disabled by assigning very large cost coefficients to penalize the selection of the corresponding decision variables. These desirable features facilitate the hardware implementation of PAN and DAN. Since the minimum absolute value of the nonzero eigenvalues of PAN and DAN is directly proportional to the size of the assignment problem, the convergence rate of PAN and DAN implemented in hardware is ultimately statistically proportional to the size of the assignment problem. Further investigations are aimed at the analog implementation, the extension to the quadratic assignment and generalized assignment problems, and specific applications of PAN and DAN in telecommunication and intelligent transportation systems. PAN and DAN implemented in a VLSI circuit will serve as coprocessors for onboard planning in dynamic decision environments, solving large-scale assignment problems in real time.

REFERENCES

[1] M. S. Bazaraa, J. J. Jarvis, and H. D. Sherali, Linear Programming and Network Flows, 2nd ed. New York: Wiley, 1990.
[2] D. G. Luenberger, Linear and Nonlinear Programming, 2nd ed. Reading, MA: Addison-Wesley, 1984.
[3] J. J. Hopfield and D. W. Tank, "Neural computation of decisions in optimization problems," Biol. Cybern., vol. 52, pp. 141–152, 1985.
[4] D. W. Tank and J. J. Hopfield, "Simple neural optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit," IEEE Trans. Circuits Syst., vol. CAS-33, pp. 533–541, 1986.


[5] M. P. Kennedy and L. O. Chua, "Neural networks for nonlinear programming," IEEE Trans. Circuits Syst., vol. CAS-35, pp. 554–562, 1988.
[6] A. Rodríguez-Vázquez, R. Domínguez-Castro, A. Rueda, J. L. Huertas, and E. Sánchez-Sinencio, "Nonlinear switched-capacitor 'neural networks' for optimization problems," IEEE Trans. Circuits Syst., vol. 37, pp. 384–397, 1990.
[7] C. Y. Maa and M. A. Shanblatt, "Linear and quadratic programming neural network analysis," IEEE Trans. Neural Networks, vol. 3, pp. 580–594, 1992.
[8] J. Wang, "Analysis and design of a recurrent neural network for linear programming," IEEE Trans. Circuits Syst. I: Fundamental Theory Applicat., vol. 40, pp. 613–618, 1993.
[9] ——, "A deterministic annealing neural network for convex programming," Neural Networks, vol. 7, pp. 629–641, 1994.
[10] Y. Xia and J. Wang, "Neural network for solving linear programming problems with bounded variables," IEEE Trans. Neural Networks, vol. 6, pp. 515–519, 1995.
[11] S. P. Eberhardt, T. Daud, D. A. Kerns, T. X. Brown, and A. P. Thakoor, "Competitive neural architecture for hardware solution to the assignment problem," Neural Networks, vol. 4, no. 4, pp. 431–442, 1991.
[12] J. Wang and V. Chankong, "Recurrent neural networks for linear programming: Analysis and design principles," Computers Operations Res., vol. 19, nos. 3/4, pp. 297–311, 1992.
[13] J. Wang, "Analog neural network for solving the assignment problem," Electron. Lett., vol. 28, no. 11, pp. 1047–1050, 1992.


[14] W. J. Wolfe, J. M. MacMillan, G. Brady, R. Mathews, J. A. Rothman, M. D. Orosz, C. Anderson, and G. Alaghband, "Inhibitory grids and the assignment problem," IEEE Trans. Neural Networks, vol. 4, pp. 319–331, 1993.
[15] J. J. Kosowsky and A. L. Yuille, "The invisible hand algorithm: Solving the assignment problem with statistical physics," Neural Networks, vol. 7, pp. 477–490, 1994.
[16] K. Urahama, "Analog circuit for solving assignment problem," IEEE Trans. Circuits Syst. I: Fundamental Theory Applicat., vol. 40, pp. 426–429, 1994.
[17] P.-Y. Ting and R. A. Iltis, "Diffusion network architectures for implementation of Gibbs sampler with applications to the assignment problem," IEEE Trans. Neural Networks, vol. 5, pp. 622–638, 1994.
[18] J. Wang, "Analysis and design of an analog sorting network," IEEE Trans. Neural Networks, vol. 5, pp. 962–971, 1995.
[19] H. E. Rauch and T. Winarske, "Neural networks for routing communication traffic," IEEE Contr. Syst. Mag., vol. 8, no. 2, pp. 26–30, 1988.
[20] M. K. M. Ali and F. Kamoun, "Neural networks for shortest path computation and routing in computer networks," IEEE Trans. Neural Networks, vol. 4, pp. 941–954, 1993.
[21] J. Wang, "A recurrent neural network for solving the shortest path problem," IEEE Trans. Circuits Syst. I: Fundamental Theory Applicat., vol. 43, pp. 482–486, 1996.
[22] A. J. Hoffman and H. M. Markowitz, "A note on shortest path, assignment, and transportation problems," Naval Res. Logistics Quart., vol. 10, pp. 375–380, 1963.