Flow Diagrams of the Quadratic Neural Network


David R.C. Dominguez (1), E. Korutcheva (2), W.K. Theumann (3), and R. Erichsen Jr. (3)

(1) E.T.S. Informática, Universidad Autónoma de Madrid, Cantoblanco 28049 Madrid, Spain, [email protected]
(2) Departamento de Física Fundamental, UNED, c/Senda del Rey No 9, 28080 Madrid, Spain
(3) Instituto de Física, Universidade Federal do Rio Grande do Sul, C.Postal 15051, 91501-970 Porto Alegre, Brazil

Abstract. The macroscopic dynamics of an extremely diluted three-state neural network based on mutual information and mean-field theory arguments is studied in order to establish the stability of the stationary states. Results are presented in terms of the pattern-recognition overlap, the neural activity, and the activity-overlap. It is shown that the presence of synaptic noise is essential for the stability of states that recognize only the active patterns when the full structure of the patterns is not recognizable. Basins of attraction of considerable size are obtained in all cases for a not too large storage ratio of patterns.

1 Introduction

In a recent paper [1], information theory was used to obtain an effective energy function that maximizes the Mutual Information of a three-state neural network. Neurons with a structure other than that of binary-state neurons are relevant from both the biological and the technological points of view. The performance of the stationary states of an extremely diluted version of the network in mean-field theory revealed the presence of either retrieval, quadrupolar or zero phases.

A three-state neural network is defined by a set of $\mu = 1, \ldots, p$ embedded ternary patterns, $\{\xi_i^\mu \in \{0, \pm 1\}\}$, on sites $i = 1, \ldots, N$, which are assumed here to be independent random variables that follow the probability distribution

$$p(\xi_i^\mu) = a\,\delta(|\xi_i^\mu|^2 - 1) + (1 - a)\,\delta(\xi_i^\mu), \tag{1}$$

where $a$ is the activity of the patterns (sites with $\xi_i^\mu = 0$ are the inactive ones). Accordingly, the neuron states are three-state dynamical variables, defined as $\sigma_i \in \{0, \pm 1\}$, $i = 1, \ldots, N$, and coupled to other neurons through synaptic connections which, for our purpose, are of a Hebbian-like form [1]. The active states, $\sigma_i = \pm 1$, become accessible by means of an effective threshold.
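As an illustration of the pattern distribution in Eq. (1), the following minimal sketch draws one ternary pattern. It assumes that the weight $a$ on $|\xi| = 1$ is split equally between $+1$ and $-1$; the function name `sample_pattern`, the seed and the sizes are hypothetical choices made only for this example.

```python
import random


def sample_pattern(n_sites, activity, rng):
    """Draw one ternary pattern, independently on every site.

    Assumption (not stated explicitly in Eq. (1)): the two active values
    are equally likely, so P(+1) = P(-1) = activity / 2, P(0) = 1 - activity.
    """
    pattern = []
    for _ in range(n_sites):
        if rng.random() < activity:
            pattern.append(rng.choice((-1, 1)))  # active site
        else:
            pattern.append(0)                    # inactive site
    return pattern


rng = random.Random(0)  # fixed seed, used only for reproducibility
xi = sample_pattern(n_sites=20000, activity=0.8, rng=rng)

# Empirical activity: fraction of sites with |xi| = 1; it should be close to a.
active_fraction = sum(x * x for x in xi) / len(xi)
print("empirical activity:", active_fraction)
```

For large `n_sites` the empirical activity fluctuates around $a$ with a spread of order $\sqrt{a(1-a)/N}$.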

Permanent Address: G.Nadjakov Inst. Solid State Physics, Bulgarian Academy of Sciences, 1784 Sofia, Bulgaria

J.R. Dorronsoro (Ed.): ICANN 2002, LNCS 2415, pp. 129–134, 2002. © Springer-Verlag Berlin Heidelberg 2002


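The pattern retrieval task is successful if the state of the neurons $\{\sigma_i\}$ matches a given pattern $\{\xi_i^\mu\}$. The measure of the quality of retrieval that we use here is the mutual information, which is a function of the conditional distribution of the neuron states given the patterns, $p(\sigma|\xi)$. The order parameters needed to describe this information are the large-$N$ (thermodynamic) limits of the standard overlap of the $\mu$th pattern with the neuron state,

$$m_N^\mu \equiv \frac{1}{aN} \sum_i \xi_i^\mu \sigma_i \to m = \left\langle \langle \sigma \rangle_{\sigma|\xi}\, \frac{\xi}{a} \right\rangle_\xi, \tag{2}$$

the neural activity,

$$q_N^t \equiv \frac{1}{N} \sum_i |\sigma_i^t|^2 \to q = \left\langle \langle \sigma^2 \rangle_{\sigma|\xi} \right\rangle_\xi, \tag{3}$$

and the so-called activity-overlap [4,5],

$$n_N^{\mu t} \equiv \frac{1}{aN} \sum_i^N |\sigma_i^t|^2 |\xi_i^\mu|^2 \to n = \left\langle \langle \sigma^2 \rangle_{\sigma|\xi}\, \frac{\xi^2}{a} \right\rangle_\xi. \tag{4}$$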

The brackets denote averages over the probability distributions. A study of the stability of the stationary states, as well as the dynamic evolution of the network, with particular emphasis on the size of the basins of attraction, has not been carried out so far, and the purpose of the present work is to report new results on these issues.
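The finite-$N$ definitions of the overlap, the neural activity and the activity-overlap can be evaluated directly for a given network state and pattern. The sketch below is only an illustration of those three sums; the function name `order_parameters` and the toy vectors are hypothetical.

```python
def order_parameters(sigma, xi, activity):
    """Finite-N overlap m, neural activity q and activity-overlap n.

    sigma and xi are sequences with entries in {-1, 0, +1};
    activity is the pattern activity a.
    """
    n_sites = len(sigma)
    m = sum(x * s for x, s in zip(xi, sigma)) / (activity * n_sites)
    q = sum(s * s for s in sigma) / n_sites
    n = sum(s * s * x * x for x, s in zip(xi, sigma)) / (activity * n_sites)
    return m, q, n


# Toy pattern with two active sites out of four, i.e. a = 0.5.
xi = [1, -1, 0, 0]

# A state identical to the pattern gives m = 1, q = a, n = 1.
print(order_parameters(xi, xi, activity=0.5))

# A state that is active on the right sites, with one wrong sign and one
# spurious active neuron: the overlap m vanishes while n stays at 1.
print(order_parameters([1, 1, 1, 0], xi, activity=0.5))
```

The second example shows why $n$ is tracked separately from $m$: the state carries no information about the signs of the pattern, yet it still locates the active sites.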

2 Mutual Information

The Mutual Information between patterns and neurons is defined as $I[\sigma; \xi] = \langle \langle \ln[p(\sigma|\xi)/p(\sigma)] \rangle_{\sigma|\xi} \rangle_\xi$, regarding the patterns as the input and the neuron states as the output of the channel at each time step [2,3]. $I[\sigma; \xi]$ can also be written as the entropy of the output minus the conditional entropy (the so-called equivocation term). For the conditional probability we assume the form [1]

$$p(\sigma|\xi) = (s_\xi + m\xi\sigma)\,\delta(\sigma^2 - 1) + (1 - s_\xi)\,\delta(\sigma) \tag{5}$$

with $s_\xi = s + (n - q)\xi^2/(1 - a)$ and $s = (q - na)/(1 - a)$.

We search for an energy function which is symmetric under any permutation of the patterns $\xi^\mu$, since they are not known initially to the retrieval process, and we assume that the initial retrieval of any pattern is weak, i.e. the overlap $m^\mu \sim 0$ [1]. For general $a$ and $q$, the variable $\sigma^2$ is also initially almost independent of $(\xi^\mu)^2$, so that $n^\mu \sim q$. Hence, the parameter $l^\mu \equiv (n^\mu - q)/(1 - a) = \langle \sigma^2 \eta^\mu \rangle$, with $\eta^\mu \equiv ((\xi^\mu)^2 - a)/(a(1 - a))$, also vanishes initially in this limit. Note that this parameter measures the recognition of a fluctuation in $(\xi^\mu)^2$ by the binary state variable $\sigma^2$. An expansion of the mutual information around $m^\mu = 0$ and $l^\mu = 0$ thus gives, initially, when $a \sim q$, $I^\mu \approx \frac{1}{2}(m^\mu)^2 + \frac{1}{2}(l^\mu)^2$, and the total information of


the network will be given by summing over all the patterns, $I_{pN} = N \sum_\mu I^\mu$. An energy function $H = -I$ that rules the network learning and retrieval dynamics is obtained from this expansion [1]. The dynamics reads $\sigma_i^{t+1} = \mathrm{sign}(h_i^t)\,\Theta(|h_i^t| + \theta_i^t)$, with the neuron potential and the effective threshold given respectively by

$$h_i^t = \sum_j J_{ij} \sigma_j^t, \quad J_{ij} \equiv \frac{1}{a^2 N} \sum_{\mu=1}^{p} \xi_i^\mu \xi_j^\mu, \tag{6}$$

$$\theta_i^t = \sum_j K_{ij} (\sigma_j^t)^2, \quad K_{ij} \equiv \frac{1}{N} \sum_{\mu=1}^{p} \eta_i^\mu \eta_j^\mu. \tag{7}$$
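To make the update rule concrete, here is a minimal sketch of one parallel, noise-free step of the dynamics with the couplings of Eqs. (6) and (7). It is an illustration under simplifying assumptions that are not taken from the text: the network is fully connected (no dilution), self-couplings are kept, $\Theta(0)$ and $\mathrm{sign}(0)$ are both taken as 0, and the names `eta` and `update` are hypothetical. The sums over $j$ are evaluated pattern by pattern, so the matrices $J_{ij}$ and $K_{ij}$ are never built explicitly.

```python
def eta(xi_value, activity):
    """Normalized fluctuation of the squared pattern, eta = (xi^2 - a) / (a (1 - a))."""
    return (xi_value ** 2 - activity) / (activity * (1.0 - activity))


def update(sigma, patterns, activity):
    """One parallel step sigma -> sign(h) * Theta(|h| + theta), without noise."""
    n_sites = len(sigma)
    # Per-pattern sums over j: sum_j xi_j sigma_j and sum_j eta_j sigma_j^2.
    overlap = [sum(x * s for x, s in zip(xi, sigma)) for xi in patterns]
    act_overlap = [sum(eta(x, activity) * s * s for x, s in zip(xi, sigma))
                   for xi in patterns]
    new_sigma = []
    for i in range(n_sites):
        # Neuron potential h_i, Eq. (6).
        h = sum(xi[i] * o for xi, o in zip(patterns, overlap)) / (activity ** 2 * n_sites)
        # Effective threshold theta_i, Eq. (7).
        theta = sum(eta(xi[i], activity) * c
                    for xi, c in zip(patterns, act_overlap)) / n_sites
        if abs(h) + theta > 0:  # step function Theta, with Theta(0) = 0 assumed
            new_sigma.append(1 if h > 0 else -1 if h < 0 else 0)
        else:
            new_sigma.append(0)
    return new_sigma


# Single stored pattern with activity a = 0.5 (four active sites out of eight).
xi = [1, -1, 0, 1, 0, -1, 0, 0]
corrupted = [1, -1, 0, 1, 0, 0, 0, 0]  # one active site switched off
print(update(corrupted, [xi], activity=0.5) == xi)
```

In this toy case with a single stored pattern, the threshold is positive on the active sites of the pattern and negative on the inactive ones, so the switched-off neuron is reactivated with the correct sign while the inactive sites stay at zero.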

3 Macrodynamics for the Diluted Network

The asymptotic macrodynamics for the parameters in Eqs. (2), (3) and (4) in the extremely diluted version of the model follows the single-step evolution equations, exact in the large-$N$ limit, for each time step $t$ [1],

$$m_{t+1} = \int D\Phi(y) \int D\Phi(z)\, F_\beta\!\left(\frac{m_t}{a} + y\Delta_t;\ \frac{l_t}{a} + z\frac{\Delta_t}{1-a}\right), \tag{8}$$

$$q_t = a n_t + (1 - a) s_t, \tag{9}$$

$$n_{t+1} = \int D\Phi(y) \int D\Phi(z)\, G_\beta\!\left(\frac{m_t}{a} + y\Delta_t;\ \frac{l_t}{a} + z\frac{\Delta_t}{1-a}\right), \tag{10}$$

for the overlap, the neural activity, and the activity-overlap, respectively, where the parameter $s$ defined above, which evolves as

$$s_{t+1} \equiv \int D\Phi(y) \int D\Phi(z)\, G_\beta\!\left(y\Delta_t;\ -\frac{l_t}{1-a} + z\frac{\Delta_t}{1-a}\right), \tag{11}$$

describes the wrong matches between the active neurons and the patterns. Here, $D\Phi(y)$ and $D\Phi(z)$ are Gaussian probability distributions for random variables $y$ and $z$ with zero mean and unit variance, whereas $\Delta_t^2 = \alpha q_t / a^2$, in which $\alpha = p/N$ is the storage ratio of patterns. The functions

$$F_\beta(h, \theta) = \frac{1}{Z}\, 2 e^{\beta\theta} \sinh(\beta h), \quad G_\beta(h, \theta) = \frac{2}{Z}\, e^{\beta\theta} \cosh(\beta h), \tag{12}$$

with $Z = 1 + 2 e^{\beta\theta} \cosh(\beta h)$, are the mean $\langle \sigma_t \rangle$ and the square mean $\langle \sigma_t^2 \rangle$ of the neuron states over the synaptic noise with parameter $\beta = a/T$, respectively.
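The single-step map for $(m_t, l_t, q_t)$ can be iterated numerically. The sketch below is one possible way to do so and rests on assumptions that are not part of the text: the Gaussian averages are approximated on a normalized midpoint grid truncated at $|y|, |z| \le 6$, $G_\beta$ is written in the algebraically equivalent form $1/(1 + e^{-\beta\theta}/(2\cosh\beta h))$ and $F_\beta$ as $G_\beta \tanh(\beta h)$ to avoid overflow, $l_{t+1}$ is obtained from its definition $l = (n - q)/(1 - a)$, and $T > 0$ is required. The function names and the parameter values in the example are hypothetical.

```python
import math


def gaussian_grid(n_points=61, cutoff=6.0):
    """Midpoint nodes and normalized weights approximating the measure DPhi."""
    step = 2.0 * cutoff / n_points
    nodes = [-cutoff + (k + 0.5) * step for k in range(n_points)]
    weights = [math.exp(-0.5 * y * y) for y in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]


def G_beta(h, theta, beta):
    """Square mean of sigma: 2 e^{beta theta} cosh(beta h) / Z, in a stable form."""
    u = abs(beta * h)
    log_2cosh = u + math.log1p(math.exp(-2.0 * u))
    x = -beta * theta - log_2cosh  # G = 1 / (1 + exp(x))
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def F_beta(h, theta, beta):
    """Mean of sigma: 2 e^{beta theta} sinh(beta h) / Z = G * tanh(beta h)."""
    return G_beta(h, theta, beta) * math.tanh(beta * h)


def macro_step(m, l, q, activity, alpha, temperature, grid=None):
    """One step (m_t, l_t, q_t) -> (m_{t+1}, l_{t+1}, q_{t+1}) of Eqs. (8)-(11)."""
    nodes, weights = grid or gaussian_grid()
    beta = activity / temperature
    delta = math.sqrt(alpha * q) / activity  # Delta_t
    m_new = n_new = s_new = 0.0
    for y, wy in zip(nodes, weights):
        for z, wz in zip(nodes, weights):
            w = wy * wz
            h = m / activity + y * delta
            theta = l / activity + z * delta / (1.0 - activity)
            theta_s = -l / (1.0 - activity) + z * delta / (1.0 - activity)
            m_new += w * F_beta(h, theta, beta)            # Eq. (8)
            n_new += w * G_beta(h, theta, beta)            # Eq. (10)
            s_new += w * G_beta(y * delta, theta_s, beta)  # Eq. (11)
    q_new = activity * n_new + (1.0 - activity) * s_new    # Eq. (9)
    l_new = (n_new - q_new) / (1.0 - activity)
    return m_new, l_new, q_new


# Example flow from m = 1, l = 1, q = a; the values of alpha and T are arbitrary.
a, alpha, T = 0.8, 0.1, 0.2
grid = gaussian_grid()
state = (1.0, 1.0, a)
for _ in range(10):
    state = macro_step(*state, activity=a, alpha=alpha, temperature=T, grid=grid)
print("m, l, q after 10 steps:", state)
```

Repeating the loop from a set of different initial triples $(m_0, l_0, q_0)$ is one way to trace numerically which initial conditions flow to the same stationary state.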

4 Results

There are two main aspects in which we are interested in the present work. One is the nature of the possible phases in the order-parameter space $(m, l, q)$, for given pattern activity $a$, storage ratio $\alpha$ and synaptic noise parameter (temperature) $T$. These are results to be extracted from the stationary states of the network and some have been presented before [1]. They concern a retrieval phase R(m =

[Figure residue: m0=1; l0=1; q0=a]

a=0.8; t