[September 1997, submitted for publication. Readers' comments are invited.]

Curved trajectory prediction using a self-organizing neural network

Jonathan A. Marshall

Viswanath Srikanth

September 1997 Department of Computer Science, CB 3175, Sitterson Hall University of North Carolina, Chapel Hill, NC 27599-3175, U.S.A. [email protected], 919-962-1887, fax 919-962-1799

Abstract

Existing neural network models are capable of tracking linear trajectories of moving visual objects. This paper describes an additional neural mechanism, disfacilitation, that enhances the ability of a visual system to track curved trajectories. The added mechanism combines information about an object's trajectory with information about changes in the object's trajectory, to improve the estimates for the object's next probable location. Computational simulations are presented that show how the neural mechanism can learn to track the speed of objects and how the network operates to predict the trajectories of accelerating and decelerating objects.

Keywords: Curve, target tracking, neural network, disfacilitation, vision, visual learning, motion perception.

1 Introduction

The ability to track the trajectory of an object in motion is an important function in both human and computer vision: it can be used to improve localization accuracy for moving objects and even to reduce noise. How can a visual system track an object and at any point along the trajectory make the best estimate for the next position of the object? This paper describes a neural network that simulates some of the mechanisms that could aid a visual system in predictively tracking the trajectories of objects. The model suggests possible neural mechanisms that may exist in animal visual systems and may also be used to improve tracking in artificial visual systems.

Consider an object that passes through points a, b, d along its trajectory, as shown in Figure 1. When the object is at location a, it has already been moving along a linear trajectory for some time, and at this point, a visual system can generate a prediction (with the greatest probability) that the object will move to location b. When the object reaches location b, the strongest prediction succeeds. When the object is at location b, the visual system can predict that the object will move to location c. Suppose, however, that the object begins to curve as shown, changing direction through point d, leading to failure of the prediction at c. Most existing models for predictive tracking would continue to extrapolate linearly, predicting e as the next location, despite the failure of the previous prediction. The neural network simulation described in this paper uses prediction failures (such as the one at c) to improve subsequent predictions by deflecting the predictive extrapolation (away from e and toward f).

Figure 1: Path of an object. (See text for discussion.)

2 Tracking an object undergoing acceleration or deceleration

In principle, tracking an object undergoing a curved motion is similar to tracking an object accelerating or decelerating. In either case, a variable (direction of motion or speed) changes when plotted against time.

2.1 Operating principle

Figure 2 sketches a two-layer visual neural network as modeled in the work of Marshall (1990). The input layer receives visual data (e.g., preprocessed from edge or contrast detector units), and the output layer uses afferent excitatory, lateral excitatory, and lateral inhibitory interactions to integrate multiple motion measurements, thereby representing the trajectories of objects. The weights of the afferent excitatory connections from the input layer to the output layer neurons can self-organize in response to natural input motion sequences, to give each output layer neuron a local velocity preference (Fredericksen, 1993; Marshall, 1990).

The output layer has lateral excitatory connections (Figure 2), which initially project from every neuron to the other neurons within some spatial radius. The lateral excitatory pathways are modeled as having a fixed signal transmission latency. In biological neural networks, this time delay could occur in the connection pathways themselves or in integration times within the neurons; the model in this paper would work with either type of time delay. The weight of a lateral excitatory connection develops so that it corresponds approximately to the likelihood of the presynaptic (source) neuron's becoming activated just before the postsynaptic (target) neuron's activation.

Figure 2: Neural network with input layer, output layer, feedforward excitatory connections, lateral excitatory connections, and lateral inhibitory connections. Only some of the neurons and connections are shown here; the other neurons and connections are similarly arranged.

The output layer also has lateral inhibitory connections, which initially project from each neuron to every other neuron, within some radius in the layer. The weights of the lateral inhibitory connections develop so that they are strongest between neurons that tend to receive simultaneous excitation (Marshall, 1990, 1995). The lateral inhibitory connections are modeled as having a zero or negligible signal transmission latency.

The network is trained by exposing it to sequences of input data that simulate the visual motion of objects in a simplified natural environment. Because of inertia, straight paths are encountered more frequently than curved paths. The patterns presented cover a range of motion trajectories. When the network has gone through many presentations of the patterns, the lateral excitatory connection weights stabilize around values that approximately encode the motion sequence probability distributions (Marshall, 1990). These distributions predict the greatest probability for maintaining the inertial direction and speed of motion, and lesser probabilities for curves or accelerations or decelerations. If an object follows a curved trajectory, then predictions based on these distributions would be suboptimal, since they would always give the highest probability to a straight, linear trajectory.

In the neural network simulation, the lateral signals that propagate over the lateral excitatory connections constitute the predictions of the network, by definition. Thus, both neuron activations and lateral excitatory connection weights affect the value of the prediction signals (lateral excitation) received by a neuron. The lateral excitatory connection weights from a neuron can be represented as a matrix of prediction probabilities W, where each element W_{ij} represents the weight of the lateral excitatory connection from neuron i to neuron j. Each neuron is sensitive to a particular position, speed, and direction of motion. The predictions X^{pred} can be generated from the actual current activations X^{actual} as follows:

X^{pred}_j(t + \tau) = \sum_i X^{actual}_i(t) \, W_{ij},

where \tau represents a small, fixed time interval.
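As a concrete illustration (not part of the original model description), this prediction step can be sketched as a matrix-vector product in Python; the variable names, sizes, and weight values below are assumptions made for exposition only.

    import numpy as np

    # Minimal sketch of X^pred_j(t + tau) = sum_i X^actual_i(t) * W_ij.
    # W[i, j] holds the lateral excitatory weight from neuron i to neuron j.
    n_neurons = 24
    rng = np.random.default_rng(0)
    W = rng.uniform(0.0, 0.1, size=(n_neurons, n_neurons))  # stand-in for learned weights
    np.fill_diagonal(W, 0.0)                                 # no self-connections

    x_actual = np.zeros(n_neurons)
    x_actual[5] = 1.0   # one active neuron encoding the object's current position and speed

    x_pred = x_actual @ W   # prediction signals arriving one time delay later at each neuron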

2.2 Disfacilitation: A new mechanism that improves predictions

The addition of a new mechanism, called disfacilitation, causes the peak of the network's overall predictions to follow the trajectory of an object more closely. The term "disfacilitation" has been used previously to refer to the withdrawal of the excitation (facilitation) felt by a neuron (Bloomfield, 1994; Schupp et al., 1994; Singer, 1982; Tsukahara et al., 1965; Woodward et al., 1995). Figure 3 shows the operation of the mechanism when an object following a curved path is tracked. The same concept is applicable for tracking an object that is undergoing acceleration or deceleration. The operation can be described by the following equation:

X^{pred}_j(t + \tau) = \sum_i X^{actual}_i(t) \, W_{ij} \;-\; D \sum_i \max\!\left(0,\; X^{pred}_i(t) - X^{actual}_i(t)\right) W_{ij}.
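The sketch below (an illustration with assumed names, not the authors' code) implements this modified prediction rule: wherever the previous predictions exceeded the actual activations, the shortfall is propagated over the same weights and subtracted, scaled by D.

    import numpy as np

    # Disfacilitation-modified prediction of Section 2.2 (illustrative sketch).
    # x_pred_prev holds the predictions made one time step earlier.
    def predict_with_disfacilitation(x_actual, x_pred_prev, W, D=1.5):
        excitation = x_actual @ W                             # sum_i X^actual_i(t) * W_ij
        shortfall = np.maximum(0.0, x_pred_prev - x_actual)   # positive only where predictions failed
        disfacilitation = shortfall @ W                       # failed predictors project it onward
        return excitation - D * disfacilitation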

When the difference between the predicted and actual activations, X^{pred}_i(t) - X^{actual}_i(t), is positive, it indicates a "failed" prediction, and neuron i is said to experience a shortfall. This shortfall is used by the network to modulate the net amount of prediction received by other neurons. Disfacilitation modulates the net prediction by subtracting some amount from the lateral excitation received by the neurons, so that a better prediction can be obtained. Because the mechanism reduces excitation, or facilitation, to some neurons, it is referred to as disfacilitation. The new set of predictions is the net result of the disfacilitation and the lateral excitation felt at any particular neuron.

Figure 3: Tracking along curved trajectories. (A) Object moves along a curved trajectory (thick black line). A local motion detector (gray ellipse) can detect the motion and generate a linear prediction of the object's future position (gray arrow). However, a mismatch occurs: the prediction differs considerably from the actual subsequent trajectory (thin black line). (B) A larger-scale local motion detector would generate worse predictions. (C) The new neural mechanism dynamically compares the predictions to the actual object trajectories. When a mismatch occurs, the network releases a "disfacilitatory deflection" pulse (shading), which is strongest in the direction of the failed prediction. The disfacilitation represents a withdrawal of excitation to neurons in the deflection region. This altered signal biases the new predictions (second gray arrow) away from the failed prediction and toward the actual trajectory of the object.

It will now be shown how this mechanism can improve the prediction distribution. Figure 4 shows a curved path followed by an object. As the object follows the curve, neurons selective for the position, speed, and direction of the moving object become activated. When neuron a becomes active, the network predicts that neuron b will become active next. This prediction is successful as the object reaches b. The network then strongly predicts that neuron c will become active, but that prediction is not fulfilled as the object starts to curve, activating neuron d. The object, continuing on its curved path, causes neuron f to become active next. Without disfacilitation, the network would have strongly predicted that neuron e would become active. This behavior of strongly predicting a position along a straight-line path, extrapolated along the preferred direction of the previously active neuron, would continue at all points along the object's trajectory.

Figure 4: Neuronal signaling along a curved trajectory. (See text for discussion.)

Disfacilitation improves the behavior exhibited. The aim is to predict path d → f more strongly and path d → e less strongly, and if possible, even to cause d → f to be the strongest prediction. In this example, when the object moves from position b to d, the network experiences the failure of its strongest prediction, b → c. The neuron that receives the strongest prediction but fails to become activated (c) exerts disfacilitation along its output lateral excitatory pathways. The disfacilitation is felt most strongly by neurons farther along the same linear trajectory. Both neurons e and f feel the effect of disfacilitation (less strongly), but D(c → e) > D(c → f), where D is the amount of disfacilitation felt. The effect of disfacilitation is felt simultaneously with the lateral excitations, causing the net predictions to become E(d → e) ⊖ D(c → e) and E(d → f) ⊖ D(c → f), where E represents the lateral excitatory signals and the operation ⊖ refers to the subtractive (or similar) manner in which disfacilitation interacts with the lateral excitation, as described in Section 3.5. Because D(c → e) > D(c → f), the difference in the net peak prediction values for the two positions decreases. With an appropriate set of parameters, it is possible to obtain a complete switch; that is, E(d → f) ⊖ D(c → f) > E(d → e) ⊖ D(c → e). In other words, by simply using the network's structure and the success or failure of the network predictions, the network's future predictions can be modified to improve tracking. This method corresponds to simply including in the predictions a term based on the derivative of the motions.
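To make the switch concrete, consider a purely illustrative worked example (the numbers are invented for exposition and are not taken from the simulation), with ⊖ taken as ordinary subtraction: suppose E(d → e) = 0.6, E(d → f) = 0.4, D(c → e) = 0.35, and D(c → f) = 0.05. The net predictions are then 0.6 − 0.35 = 0.25 for e and 0.4 − 0.05 = 0.35 for f, so the peak of the prediction distribution switches from e, on the extrapolated straight line, to f, on the curved path.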

3 Simulation

This section presents a simple simulation to illustrate the self-organization of a neural network that predictively tracks an object, and the operation of the disfacilitation mechanism for improving the predictions of the network. The network is trained on motion patterns of a point object moving on a linear trajectory along one dimension with varying speeds (i.e., accelerating and decelerating intermittently). In the simulation, accelerations and decelerations are approximated by discrete steps (of 0, +1, or −1), and the speed is a whole number (1, 2, or 3). It is assumed that accelerations and decelerations usually occur episodically, over more than one time step; thus, if an object accelerates during one time step, it is likely to continue accelerating during the next time step.


3.1 Initial network structure

The network consists of an input layer of neurons, an output layer of neurons, feedforward excitatory connections (from input to output layer), lateral excitatory connections (in the output layer), and lateral inhibitory connections (in the output layer).

Input Layer: This layer consists of 24 input neurons, displayed in an 8 × 3 grid, that project feedforward connections to the output layer. These neurons represent the output of some prior processing stages that detect local motion. The neurons in the input layer are motion-selective; they respond selectively to stimuli at different positions moving at different speeds. The development of such selectivities has been demonstrated before (Fredericksen, 1993; Marshall, 1990).

Output Layer: There are 24 output layer neurons, which are also displayed in a grid representing 8 positions × 3 speeds. One row of eight neurons is sensitive to a speed of +1, a second row to +2, and the third to +3. These correspond directly to the position and speed selectivities of the input neurons. The output neurons track the object as it moves along a single dimension (rightward along the rows). The object's motion is represented by the activation of the appropriate neurons across the rows (which represent different speed selectivities).

Feedforward excitatory connections: These connections project from the input to the output layer of neurons, preserving two-dimensional neighborhood relationships, or retinotopy. In the simulation, there is a one-to-one mapping between the neurons in the two layers. Because the simulations focus on the self-organization of the lateral connections, the weights of the feedforward connections are set to 1 between corresponding neurons and to 0 between other neurons and are fixed through the simulation.

Lateral excitatory connections: In the simulation, these connections are time-delayed: any signal takes one time unit to travel through the connection. These connections project from every neuron in the output layer to the 23 other neurons in that layer. The weights of these connections change according to a learning rule and the training patterns. All the lateral excitatory connections are set to an initial weight of approximately 0.01 (i.e., quite weak).

Lateral inhibitory connections: These connections also project from every neuron in the output layer to every other neuron in that layer. The weights of these connections also change in accordance with a learning rule and the training patterns, with the initial weight set to about 0.25. In the simulation, these connections are modeled as having a zero or negligibly small time delay.
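A minimal sketch of this initial structure is given below; only the layer sizes and initial weight values come from the text, while the variable names and the index layout are assumptions made for illustration.

    import numpy as np

    # Hypothetical initial structure for the network of Section 3.1.
    N_POSITIONS, N_SPEEDS = 8, 3
    N = N_POSITIONS * N_SPEEDS          # 24 neurons per layer

    rng = np.random.default_rng(1)

    # Feedforward weights: fixed one-to-one mapping between input and output neurons.
    W_ff = np.eye(N)

    # Lateral excitatory weights: weak (~0.01, with ~1% randomness), learned during training,
    # carried over connections with a one-time-unit signal delay.
    W_exc = 0.01 * (1.0 + 0.01 * rng.standard_normal((N, N)))
    np.fill_diagonal(W_exc, 0.0)

    # Lateral inhibitory weights: ~0.25 initially, zero delay, also learned during training.
    W_inh = 0.25 * np.ones((N, N))
    np.fill_diagonal(W_inh, 0.0)

    def unit_index(position, speed):
        """Map (position 0-7, speed +1..+3) to a neuron index (assumed row-major layout)."""
        return (speed - 1) * N_POSITIONS + position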

3.2 Input data

The network is presented with sequences of input patterns that represent object motion, and these patterns are generated based on a probability distribution that gives a higher probability for an object to maintain its speed than to accelerate or decelerate. Thus, if an object is moving at a speed of +2, then its speed at the next moment is more likely to be +2 than +1 or +3. But when an object accelerates from a speed of +1 to +2, the probability of the object accelerating further to a speed of +3 is greater than its probability of maintaining a speed of +2. This was done to simulate persistence of acceleration, as encountered in our natural environment. The same property holds for deceleration from +3 to +2 units of speed. Because of the limitation in the size of the simulation, an object moving at a speed of +1 or +3 has only two options: either to maintain speed or to accelerate (in the case of +1 speed) or decelerate (in the case of +3 speed). However, when it is moving at a speed of +2, it can either accelerate or decelerate.

The probability values used for the generation of input data are illustrated in Figure 5. Both cases, acceleration/deceleration and constant speed, are described. As shown in Tables A, C, and E, if an object has been previously moving at a constant speed, there was a higher probability (30%) to maintain that speed than to change speeds. Occasionally, a neuron that is maximally sensitive to a speed of +2 units may have become activated at a positional displacement of +1 unit as well (15%, Table C). This property is modeled because neurons are not absolute detectors and instead have a range of inputs to which they are sensitive. Another point of interest in the tables is the values of the probabilities in the case of acceleration or deceleration from a speed of +1 or +3, respectively. When the object accelerates from +1 to +2 (Figure 5, Table E), the probability for further acceleration to +3 was 25%, for maintaining speed +2 was 12.5%, and for deceleration back to +1 was 12.5%.

Figure 5: Tables showing the probability of the next object location (numerical values indicate the probability for the position of the object at the following moment). Each box represents a neuron that encodes a particular position (0–7) and a particular speed (+1, +2, +3). The arrows indicate the motion of the object from its previous to its current location. The cases of acceleration or deceleration of 2 units (+1 to +3 or +3 to +1) are not shown; their prediction probabilities are the same as those of a constant-speed motion. The same applies for the cases where a neuron maximally sensitive to a particular speed (say +1) becomes activated in response to an object moving at a different speed (say +2).

The input data consisted of a sequence of inputs to the 24 neurons, with a 1 indicating the position of the object, the remaining inputs being set to 0. A pseudo-random number generator was used to choose the initial positions, movements, and durations of the objects, with the probabilistic logic described above.
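The sketch below generates input sequences in the same spirit. The transition probabilities here are illustrative stand-ins chosen only to show persistence of acceleration and preference for constant speed; they are not the exact Figure 5 values, and all names are assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    def next_speed(speed, prev_speed):
        """Choose the next speed (1, 2, or 3) given the current and previous speeds."""
        if speed == 2 and prev_speed != 2:
            # The object just accelerated or decelerated; the change tends to persist.
            continued = 3 if prev_speed == 1 else 1
            return rng.choice([speed, continued, prev_speed], p=[0.25, 0.5, 0.25])
        if speed == 1:
            return rng.choice([1, 2], p=[0.6, 0.4])
        if speed == 3:
            return rng.choice([3, 2], p=[0.6, 0.4])
        return rng.choice([2, 1, 3], p=[0.6, 0.2, 0.2])   # constant speed is most likely

    def generate_sequence(n_steps=20, n_positions=8):
        """Yield (position, speed) pairs for a point object moving along one dimension."""
        pos, speed, prev_speed = 0, 2, 2
        for _ in range(n_steps):
            yield pos, speed
            pos = (pos + speed) % n_positions          # wrap around the 8-position row
            prev_speed, speed = speed, next_speed(speed, prev_speed)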


3.3 Learning equations

In the simulation, the lateral excitatory learning rule was a standard instar Hebbian rule (Grossberg, 1982):

\frac{dW^{+}_{ij}}{dt} = \epsilon^{+} \, f(x_j(t)) \left[ -W^{+}_{ij} + h(x_i(t - \tau)) \right],

where W^{+}_{ij} is the excitatory weight from neuron i to neuron j, x_j is the activation of neuron j, x_i is the activation of neuron i, f and h are half-rectified nondecreasing functions, and \epsilon^{+} is a small learning rate parameter. This is an instar learning rule (Grossberg, 1982): the learning rate is controlled by the activation of the postsynaptic neuron.

The learning rule for lateral inhibitory connections can be expressed as

\frac{dW^{-}_{ij}}{dt} = \epsilon^{-} \, g(x_j(t)) \left[ -W^{-}_{ij} + q\!\left(B_i(t) + l \, L_i(t)\right) \right],

where W^{-}_{ij} is the inhibitory weight from neuron i to neuron j, \epsilon^{-} is the learning rate, and q is a half-rectified nondecreasing function. The expressions B_i and L_i denote the total feedforward and lateral excitation received by neuron i, respectively:

B_i = \sum_{k \in \text{input layer}} \max(0, x_k) \, W^{+}_{ki}, \qquad L_i = \sum_{k \in \text{output layer}} \max(0, x_k) \, W^{+}_{ki}.

When a neuron is active, its input inhibitory connections from excited neurons tend to become stronger (more inhibitory), and its input inhibitory connections from unexcited neurons tend to become weaker.
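A hypothetical Euler-integration step for these two rules is sketched below. The rectifying functions are taken as simple half-rectifications, the learning-rate values are illustrative, and all names are assumptions rather than the authors' implementation.

    import numpy as np

    def relu(v):
        return np.maximum(v, 0.0)

    def learning_step(x_now, x_prev, x_input, W_exc, W_inh, W_ff,
                      lr_exc=0.001, lr_inh=0.001, l=0.5, dt=0.1):
        post = relu(x_now)        # f(x_j(t)), g(x_j(t)): gate learning by postsynaptic activity
        pre = relu(x_prev)        # h(x_i(t - tau)): presynaptic activity one time delay earlier

        B = relu(x_input) @ W_ff  # total feedforward excitation into each neuron i
        L = relu(x_now) @ W_exc   # total lateral excitation into each neuron i

        # Instar rule: dW+_ij/dt = lr * f(x_j) * (-W+_ij + h(x_i(t - tau)))
        W_exc = W_exc + dt * lr_exc * post[None, :] * (-W_exc + pre[:, None])
        # Inhibitory rule: dW-_ij/dt = lr * g(x_j) * (-W-_ij + q(B_i + l * L_i))
        W_inh = W_inh + dt * lr_inh * post[None, :] * (-W_inh + relu(B + l * L)[:, None])
        return W_exc, W_inh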

3.4 Activation equation – without disfacilitation

Without disfacilitation, the following shunting equation (Grossberg, 1972) was used to govern the amount of activation of a model neuron:

\frac{dx_i}{dt} = -A \, x_i + (B - x_i) \, E_i - (C + x_i) \, I_i,

where I_i = \sum_j W^{-}_{ji} \max(0, x_j) is the total inhibition received by neuron i, and E_i = \beta \, B_i(t) \left( 1 + l \, L_i(t) \right) is a function of the total excitatory input into neuron i, in which lateral excitation can amplify the effect of feedforward excitation but cannot alone activate the neuron (Hirsch & Gilbert, 1991). The first term, -A x_i, is a decay term that causes the activation of a neuron to fall to zero in the absence of any excitation. The second term, (B - x_i) E_i, shunts the excitatory input E_i to the neuron, limiting the maximum activation value to B. The third term, -(C + x_i) I_i, shunts the inhibitory input I_i to the neuron, limiting the minimum activation of a neuron to the value -C.
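One Euler step of this shunting equation might be sketched as follows. The names are assumptions; A and B take the values listed in the Appendix, while the gain beta, the bound C, and the remaining defaults are placeholders for illustration.

    import numpy as np

    def activation_step(x, x_input, x_lateral_delayed, W_ff, W_exc, W_inh,
                        A=5.0, B=2.0, C=0.1, beta=1.0, l=0.5, dt=0.004):
        ff_exc = np.maximum(x_input, 0.0) @ W_ff               # feedforward excitation B_i
        lat_exc = np.maximum(x_lateral_delayed, 0.0) @ W_exc   # time-delayed lateral excitation L_i
        E = beta * ff_exc * (1.0 + l * lat_exc)                # lateral input amplifies, never drives alone
        I = np.maximum(x, 0.0) @ W_inh                         # total inhibition, sum_j W-_ji max(0, x_j)
        dx = -A * x + (B - x) * E - (C + x) * I                # decay, shunted excitation, shunted inhibition
        return x + dt * dx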

3.5 Activation equation – with disfacilitation

Disfacilitation can be implemented in a computer simulation as either a decrement of excitation or an increment of inhibition. The latter method is used in the simulation here. Neurons that experience a shortfall send a withdrawal-of-excitation signal along their lateral excitatory connections, which forms the disfacilitation term. This term is thus the sum of the time-delayed lateral excitations received at a neuron from the neurons that had failed to become active but had received some lateral excitation. Disfacilitation is incorporated via the activation equation. A disfacilitation term \Delta_i(t) was added:

\frac{dx_i}{dt} = -A \, x_i + (B - x_i) \, E_i - (C + x_i) \left( I_i + D \, \Delta_i \right),

where parameter D governs the magnitude of the disfacilitation effect, and

\Delta_i(t) = \sum_j \max\!\left(0,\; X^{pred}_j(t - \tau) - X^{actual}_j(t - \tau)\right) W_{ji}.

The disfacilitation at time t is based on the shortfall at time t - \tau.
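Continuing the previous sketch, the disfacilitation term and the modified activation step could be written as below (again an illustration with assumed names; D = 1.5 follows the Appendix).

    import numpy as np

    # Disfacilitation term Delta_i(t): shortfalls from the previous time step, propagated
    # over the lateral excitatory weights (W_exc[j, i] is the weight from neuron j to neuron i).
    def disfacilitation_term(x_pred_prev, x_actual_prev, W_exc):
        shortfall = np.maximum(0.0, x_pred_prev - x_actual_prev)
        return shortfall @ W_exc

    # Shunting equation of Section 3.5: disfacilitation adds to the inhibitory input.
    def activation_step_disfac(x, E, I, delta, A=5.0, B=2.0, C=0.1, D=1.5, dt=0.004):
        dx = -A * x + (B - x) * E - (C + x) * (I + D * delta)
        return x + dt * dx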

4 Results

The results from the simulation are presented in Figures 6 through 8.

4.1 Self-organization of connection weights

The lateral excitatory connection weights at the start of the simulation are approximately equal to one another, up to a 1% randomness factor. As training continues, the excitatory connection weights between successive neurons along the same row become stronger than between neurons across the rows (Figure 6). The lateral inhibitory connection weights are also approximately equal at the start of the simulation, and as training proceeds, these weights become approximately proportional to the probability of simultaneous activation of pairs of neurons. Thus the inhibition is stronger between neurons that receive simultaneous excitation and weaker between other neurons (Figure 6).

Figure 6: Sketch of lateral excitatory connection weights (solid arrows) from a neuron, after training, and lateral inhibitory connection weights (dashed arrows), after training.

4.2 Network operation

When a simulated object, moving at a uniform speed of +1, starts to accelerate, a neuron in the next row (+2) becomes activated. The strongest lateral excitation is received by a neuron that is selective for the previous speed of the object (+1) (Figure 7A), and thus the strongest prediction of the network fails. A neural network operating without disfacilitation does not make any adjustments for this failure and instead predicts that another neuron farther along the row of the currently active neuron (+2) will become activated next (Figure 7B). The object, however, continues to accelerate.

With disfacilitation added, when the object accelerates (from +1 to +2), the disfacilitation is strong farther down the row of the failed prediction (of +1 neurons), weaker at adjacent rows (+2 neurons), and weakest farther away (+3 neurons) (Figure 7C). Disfacilitation thus shifts the peak prediction away from the row of the failed prediction (away from +1) (Figure 7D). This shift brings the peak prediction and actual location of the object to a closer match.

Figure 7: Acceleration simulation results. (A) Predictions by a neuron active at time 1. The active neuron (at tip of arrow) represents a speed of +1. The squares show the predictions from this neuron. The size of the squares represents the strength of the prediction. (B) Predictions by the neuron active at time 2. The active neuron (arrow) represents a speed of +2. The strongest prediction is sent to another neuron that represents the +2 speed. (C) Disfacilitation signals sent out by the neuron that was most strongly predicted at time 1 (Figure 7A). (D) Modified net predictions from the active neuron (at time 2). The strongest net prediction now goes to a speed +3 neuron, rather than to a speed +2 neuron (compare to Figure 7B).

The prediction probability for maintaining the speed at +1 unit is higher than the prediction probability for maintaining the speed at +2 units. This difference is an artifact of the simulation and arises because an object moving at a speed of +2 units can either accelerate or decelerate with equal probability (Figure 5). This artifact does not impair the operation of the network.

In the case of deceleration, an object initially moving at speed +3 (Figure 8A) decelerates to speed +2 and continues to decelerate to speed +1. A neural network without disfacilitation would have predicted the maintenance of constant velocity at all times (Figure 8B). With the disfacilitation (Figure 8C), however, the peak prediction shifts toward +1 (Figure 8D), which yields an improved prediction.

Figure 8: Deceleration simulation results. (A) Predictions by a neuron active at time 1. The active neuron (at tip of arrow) represents a speed of +3. The squares show the predictions from this neuron. (B) Predictions at time 2. The active neuron represents a speed of +2. (C) Disfacilitation signals sent out by the neuron that was most strongly predicted at time 1 (Figure 8A). (D) Modified net predictions.

5 Discussion

5.1 Extension to multiple dimensions: The windshield wiper

This section describes how the principles illustrated above may be generalized to solve problems that have more than one dimension undergoing change at a time. The "windshield wiper" problem consists of tracking the change in position and orientation of an object. A windshield wiper executes a curved trajectory on the windshield, and it simultaneously changes its orientation (Figure 9A). The accuracy of a neural network that can predictively track both the position and orientation of the wiper can be improved by adding disfacilitation. The neurons in such a network are selective for both position and orientation, like cortical simple cells (Hubel & Wiesel, 1962). Thus, the lateral connection weights between the neurons would develop based on both the motion and orientation of the neurons, with connections within trajectories and within orientations developing most strongly.

After such connections have developed, the network operation would be very similar to the cases discussed above. The position would be predicted as described above, and any discrepancies in orientation (Figure 9B) would be corrected by sending out disfacilitation signals along the lateral connections, which would tend both to deflect the predicted position and to tilt the predicted orientation correctly (Figure 9C). This reduces the difference between the predicted and the actual orientation and position of the object, thus improving the prediction performance.

Figure 9: (A) Windshield wiper motion. (B) Incorrect trajectory predictions made without the disfacilitation mechanism. (C) The correction caused by the disfacilitation mechanism.

5.2 Interpolation in structure from motion

Saidpour, Braunstein, and Hoffman (1992) described human visual psychophysics experiments in which subjects identified the 3-D shape of the completion of an incomplete image of a rotating cylinder defined by moving random dots (Figure 10A). On average, the completions judged by the subjects were between (Figure 10C) a straight tangential completion (Figure 10D) and a circularly rounded completion (Figure 10B) of the cylinder surface. The images were viewed monocularly by the subjects. This result can potentially be simulated by a neural network with disfacilitation. The straight tangential prediction that would be generated at the edge of the incomplete surface could be modified by disfacilitation to favor continued curvature of the surface. The disfacilitation could be weak, resulting in a partial correction (Figure 10C) of the perceived curvature completion.

Figure 10: Results of experiment that tests human subjects' estimate of the location of the completion of the apparent curved surface of a cylinder (Saidpour et al., 1992). A section of the surface of the cylinder (A) is invisible (B). The intersection of extrapolated invisible tangents (dashed lines), drawn from the points where the cylinder surface becomes invisible, is (D). Subjects' estimates of the surface shape tended to be in between (C).

5.3 Interpolation property

Disfacilitation continues to perform properly even at speeds between the discrete speeds of the simulated neuron preferences. Suppose that an object is accelerating from a speed of +2 units to +2.5 units and that a neural network has neurons sensitive maximally to speeds of +2 and +3, but not +2.5. Assume that speed +2.5 can be represented by the network by simultaneous partial activation of both the +2 and +3 neurons. The strongest prediction at the previous moment would be for a +2 neuron to become fully active. However, since it becomes only partially active, it sends out disfacilitation signals that deflect the peak of the next set of predictions away from +2 unit neurons toward the +3 unit neurons. The +3 unit neurons would have been only weakly predicted to become active at the previous moment, and thus disfacilitation signals from +3 neurons would be very weak or non-existent. Thus the net peak prediction shifts to predict a speed closer to +3, thus improving the object tracking.

6 Conclusions

This paper describes a new model for an operating mechanism of neural systems, using disfacilitation to enhance predictions with respect to changes in the input stimulus trajectories. Disfacilitation reduces the lateral excitation (prediction signals) received by some neurons, deflecting the peak of the prediction signal distribution away from previously failed predictions. The operation occurs in real-time.

6.1 Open questions

Some of the open questions regarding the use of disfacilitation in a visual system are:

- When a visual system tracks more than one object, disfacilitation may interfere with the tracking of nearby objects, since the operation of disfacilitation directly affects the lateral excitation of neurons. How could the interference be minimized or eliminated? Is evidence of such interference found in human visual perception?

- Does disfacilitation provide a net improvement in curved trajectory prediction, across the distribution of images in natural visual environments?

- Can disfacilitation be used to complete incomplete static images (e.g., curved illusory contours)?

- Does curved motion produce a dynamic warping of receptive fields of animal visual neurons? If so, is a disfacilitatory mechanism involved?

6.2 Implications for theories of curvature detection

Disfacilitation works through the existing neural network structure and the failure/success of the predictions made by the network while tracking an object. This allows modifications in the predictions that depend on the particular trajectory followed by an object. Disfacilitation improves the predictive tracking of a curved trajectory without the need for any additional neurons, connections, or learning rules. Disfacilitation does not eliminate the need for "curvature"-selective neurons (Dobbins, Zucker, & Cynader, 1987). Disfacilitation can enhance the predictive tracking performance of a visual system. The performance of a visual system even with curvature-sensitive neurons would be enhanced by the addition of disfacilitation; predictions under conditions of changing curvature and curvatures outside the normal response range of the neurons would be made more accurate.

Acknowledgements

Supported in part by the Office of Naval Research (N-00014-93-1-0208) and by the Whitaker Foundation (Special Opportunity Grant). The authors thank Dr. William D. Ross for helpful comments.


References

Bloomfield SA (1994) Orientation-sensitive amacrine and ganglion cells in the rabbit retina. Journal of Neurophysiology 71:1672–1691.

Dobbins A, Zucker SW, Cynader MS (1987) Endstopped neurons in the visual cortex as a substrate for calculating curvature. Nature 329:438–441.

Fredericksen RE (1993) The Biological Computation of Visual Motion. PhD dissertation, Department of Computer Science, University of North Carolina at Chapel Hill.

Grossberg S (1982) Studies of Mind and Brain. Boston: Reidel Press.

Hirsch J, Gilbert CD (1991) Synaptic physiology of horizontal connections in the cat's visual cortex. Journal of Neuroscience 11:1800–1809.

Hubel DH, Wiesel TN (1962) Receptive fields, binocular interactions, and functional architecture in cat's visual cortex. Journal of Physiology 160:106–154.

Marshall JA (1990) Self-organizing neural networks for perception of visual motion. Neural Networks 3:45–74.

Marshall JA (1995) Adaptive perceptual pattern recognition by self-organizing neural networks: Context, uncertainty, multiplicity, and scale. Neural Networks 8:335–362.

Saidpour A, Braunstein ML, Hoffman DD (1992) Interpolation in structure from motion. Perception & Psychophysics 51:105–117.

Schupp HT, Lutzenberger W, Rau H, Birbaumer N (1994) Positive shifts of event-related potentials: A state of cortical disfacilitation as reflected by the startle reflex probe. Electroencephalography & Clinical Neurophysiology 90:135–144.

Singer W (1982) The role of attention in developmental plasticity. Human Neurobiology 1:41–43.

Tsukahara N, Toyama K, Kosaka K, Udo M (1965) 'Disfacilitation' of red nucleus neurones. Experientia 21:544–545.

Woodward DJ, Kirilov AB, Myre CD, Sawyer SF (1995) Neurostriatal circuitry as a scalar memory: Modeling and ensemble neuron recording. In: Models of Information Processing in the Basal Ganglia, JC Houk, J Davis (eds), MIT Press, 315–336.


Appendix: Simulation implementation details

The neural network was presented with 1,625,000 patterns, where a pattern is one object position from a sequence of object positions. The initial position was chosen randomly. After every 2400 time units, the input was cleared to all zeros for two time units, and then a new random position was chosen as a starting point. The input sequence was repeated every 25,000 time units. The connection weights in the network stabilized at their final values after about 625,000 pattern presentations. The remaining patterns were presented to verify the stability of the weights.

The parameter values were A = 5.0, B = 2.0, C = -0.1, l = 0.5, D = 1.5, \tau = 1, H = 1.5, Q = 2.0; two further gain parameters were set to 210 and 5.0, and the two learning rates were 0.001 and 0.0005. The signal functions were f(x_j) = \gamma \max(x_j, 0)^2, h(x) = H \max(A x / (\beta (B - x)), 0), and q(x) = Q \max(x, 0), where \gamma and \beta are positive gain constants. Each input pattern was presented for 1.875 time units, with an integration step of 0.004. An integration step of 0.1 was used for the learning equations.
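For concreteness, the signal functions listed above could be written out as follows. This is a sketch only: the gain values gamma and beta are placeholders, since their assignment to the numerical values in the parameter list is ambiguous.

    import numpy as np

    # Illustrative Python versions of the Appendix signal functions.
    def f(x_j, gamma=1.0):
        # Half-rectified and squared, scaled by a gain constant (used to gate excitatory learning).
        return gamma * np.maximum(x_j, 0.0) ** 2

    def h(x, A=5.0, B=2.0, H=1.5, beta=1.0):
        # Half-rectified transform of the delayed presynaptic signal in the excitatory rule.
        return H * np.maximum(A * x / (beta * (B - x)), 0.0)

    def q(x, Q=2.0):
        # Half-rectified linear transform used in the inhibitory learning rule.
        return Q * np.maximum(x, 0.0)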