Dynamical systems in the sensorimotor loop - Poramate Manoonpong

Dynamical systems in the sensorimotor loop: On the interrelation between internal and external mechanisms of evolved robot behavior

Martin Hülse¹, Steffen Wischmann², Poramate Manoonpong², Arndt von Twickel³, Frank Pasemann⁴

¹ University of Wales, Aberystwyth, UK
² BCCN, University of Göttingen, Germany
³ University of Osnabrück, Germany
⁴ Fraunhofer IAIS, Sankt Augustin, Germany

Abstract. This case study demonstrates how the synthesis and the analysis of minimal recurrent neural robot control provide insights into the exploration of embodiment. By using structural evolution, minimal recurrent neural networks of general type were evolved for behavior control. The small size of the neural structures facilitates thorough investigations of behavior relevant neural dynamics and how they relate to interactions of robots within the sensorimotor loop. We argue that a clarification of dynamical neural control mechanisms in a reasonable depth allows quantitative statements about the effects of the sensorimotor loop and suggests general qualitative implications about the embodiment of autonomous robots and biological systems as well.

1 Introduction

The framework of embodied artificial intelligence has impressively demonstrated that problems in behavior control of autonomous robots seem to be very hard if approached from a mere computational perspective, but turn out to be surprisingly simple when characteristics of the sensorimotor loop are taken appropriately into account [1]. The challenge is that we usually do not know a priori what “appropriate” means, because the sensorimotor loop involves all the physical properties of the robot (inertia, friction, resonances, shape, etc.) as well as its interaction with the world. Therefore, Evolutionary Robotics (ER) is proposed as a promising testbed for studying the power of embodiment [2, 3]. Artificial evolution enables the exploration of hitherto unknown and efficient solutions by reducing prejudices and predispositions made by a human designer [4]. As an example, Nolfi [5] describes the emergence of modularity in evolved neural control, which does not correspond to the task decomposition a human observer would assume. Based on such networks, Ziemke [6] emphasizes the relevance of recurrent neural networks (RNNs) in the context of multi-functional and context-sensitive behavior control. In contrast, Suzuki et al. [7] demonstrate how a simple feed-forward structure in conjunction with robot-environment interactions realizes robust and adaptive behavior control through complex visual sensorimotor mappings.

Within the realm of feed-forward and recurrent neural control, quantitative statements about the properties of the sensorimotor loop are needed [8]. It should be clarified where recurrent neural control structures are necessary and where simple feed-forward mappings are sufficient if the body, the dynamics of the environment, and the action-perception processes of a robot are taken into account.

The difficulty in deriving qualitative statements from the effects of the sensorimotor loop is twofold. On the one hand, it is impossible to find a formal description of the sensorimotor loop including all relevant aspects. On the other hand, if RNNs are used for complex behavior control, usually only the parameters, but not the structure, of a predefined neural network are optimized by evolution. In the majority of cases, the resulting control structures are high-dimensional systems. But high dimensionality makes it practically infeasible to clarify whether complex behavior is basically generated by the control structure or results from robot-environment interactions. The first point lets us conclude that qualitative statements about the impact of the sensorimotor loop on robot control can be made only indirectly, that is, based on a reasonable understanding of the evolved neurodynamics. The second aspect, namely high dimensionality, seems to counter this.

The objective of this paper is to introduce a strategy in ER that supports this approach, termed synthesis and analysis of minimal recurrent neural robot control. Further on, we give representative examples where applying this strategy has demonstrated the importance of the sensorimotor loop for behavior control of autonomous mobile robots.
The experiments will show how robot-environment interactions give rise to integrated and induced oscillations, the use of transient effects, and the emergence of rhythms and behavior coordination. All these examples show how complex behavior relevant dynamics provided by RNNs are modulated by the sensorimotor loop.

2 Synthesis and analysis of minimal recurrent neural controllers

We use a standard additive neuron model with sigmoidal transfer function $f(x)$ and discrete-time dynamics:

$$a_i(t+1) = \Theta_i + \sum_{j=1}^{n} w_{ij} \, f(a_j(t)), \quad i = 1, \ldots, n,$$

where $a_i$ is the activation of neuron $i$, $w_{ij}$ the weight of the synapse projecting from neuron $j$ to neuron $i$, and $\Theta_i$ the bias term. Already small recurrent networks of this type can generate complex dynamics [9]. This is why we apply an evolutionary algorithm called ENS³ (evolution of neural systems by stochastic synthesis) to evolve the neural connectivity structure (hidden neurons and synapses) and to optimize the corresponding parameters (weight and bias terms) at the same time. By modifying certain stochastic variation operators, such as the insertion and deletion probability for structural elements, during evolution we are able

Fig. 1. A reactive light seeking controller ($f(x) = \sigma(x) = 1/(1 + e^{-x})$) utilizing switchable period-2 oscillations for speed control (see text for details). Left: network with input neurons I1 to I5 and output neurons O1 and O2. Right: output over time [steps] for O2 and I1 (top) and for O2 and I4 (bottom).

to enforce the development of minimal neural structures (with respect to the number of hidden neurons and synapses) [10]. To understand the origins of behaviorally relevant dynamics, it is important to clarify the contribution of minimal recurrent neural networks. On some occasions, it is possible to derive behavior relevant dynamical properties directly from the structure and parameters of the RNN. But mostly, it is almost impossible to also include the dynamics of robot-environment interactions in order to explain the observed behavior in detail. In the following we do not provide further details with respect to the parameter settings of the evolutionary processes. Our focus lies exclusively on the dynamical properties of specific control structures, chosen by us as the best examples, demonstrating the essential mechanisms of frequently observed phenomena.

Behavior control by frequency modulation. It is well known from analytical investigations that over-critical negative self-connections of single neurons can generate switchable period-2 oscillations [11]. Analyses of the RNN in Fig. 1 have shown that the behavior control is actually provided by such switchable oscillators. This controller solves a light seeking task. The diagrams in Fig. 1 (right) show two examples where a period-2 oscillation modulates the behavior of a Khepera robot. The upper diagram shows the on-off switching of period-2 oscillations of output neuron O2. The switching is determined by the left proximity sensor (given by I1), that is, a switch-on causes a turn to the right. A period-2 oscillation is also used in front of a light source to generate a stop (Fig. 1, right bottom). In this situation the oscillation is determined by the increased activation of the frontal light sensor (I4). In contrast to the turning, both output neurons oscillate synchronously with period 2 (not shown).
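The switchable period-2 oscillation can be reproduced with a single neuron of the additive model given above. The following is a minimal sketch, not the evolved controller of Fig. 1: the self-weight `w_self = -8`, the bias `theta = 4`, and the inhibitory input of `-10` are hypothetical values. They are chosen only so that the fixed point is unstable without input (the slope of the logistic function is at most 1/4, so the self-weight has to be below −4) and stable once the input pushes the neuron into saturation.

```python
import math


def sigmoid(x):
    """Logistic transfer function, as used for the controller in Fig. 1."""
    return 1.0 / (1.0 + math.exp(-x))


def run_neuron(external_input, w_self=-8.0, theta=4.0, steps=200, a0=0.1):
    """Iterate a(t+1) = theta + external_input + w_self * sigmoid(a(t)).

    All parameter values are illustrative assumptions, not taken from Fig. 1.
    Returns the list of neuron outputs sigmoid(a(t)).
    """
    a = a0
    outputs = []
    for _ in range(steps):
        outputs.append(sigmoid(a))
        a = theta + external_input + w_self * sigmoid(a)
    return outputs


# Without external input the over-critical negative self-connection
# destabilizes the fixed point and the output alternates between two values.
oscillating = run_neuron(external_input=0.0)

# A strong inhibitory input shifts the neuron into saturation,
# where the fixed point is stable and the oscillation is switched off.
quiet = run_neuron(external_input=-10.0)

print("no input  :", [round(o, 3) for o in oscillating[-4:]])
print("with input:", [round(o, 3) for o in quiet[-4:]])
```

In this toy setting the input acts purely as a shift of the bias term, which is the simplest way a sensor signal can switch such an oscillator on and off.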
This controller demonstrates that oscillating output signals can be used for behavior control since the body of the robot operates as an integrator. Owing to the inertia of the robot’s body, effective motor actions result from the mean network output signals. If the robot is standing in front of a light source, both

Fig. 2. A: The robot micro.eve. B: RNN of one arm ($f(x) = \tanh(x)$) with input neurons I1 to I3 and output neuron O1. C: Neuron output of I1, I3, and O1 over time [sec] (see text for details).

outputs are permanently oscillating between 1 and 0. Hence, the mean over time is 0.5. Due to the applied post-processing, an effective motor signal of 0.5 represents a motor speed of zero. Such effects are not superficial results of artificial evolution. Morris and Hooper [12] demonstrated that in biological systems slow muscle contractions are coded by the average amplitudes of fast rhythmic neural activities.

Induced oscillations. Fig. 2 shows an example where motor oscillations are induced through the environmental loop. The ring-shaped robot micro.eve (Fig. 2A) is placed on two passive rollers on which it can rotate around its body center by moving its five independent arms in order to translate the overall center of mass in a coordinated way. For further details about the robot and different control strategies see [13]. Here, the presented RNN (Fig. 2B) is one of five completely autonomous networks, each of which independently controls one of the five arms. Because of the information provided by the Hall sensor I1 and the gyroscope I2 (both part of the ring), every RNN gets information about the movement of the common body. Therefore, single controllers can “sense” the resulting effects of the other controllers’ behavior. I3 gives information about the current motor position of the controlled arm. The output neuron signal O1 represents the motor command for the servo motor. As one can see in Fig. 2C, the signal is oscillating since the Hall sensory input remains zero at the beginning, where the robot has to initiate its own rotational movement. These oscillations cannot be deduced from the dynamical properties of the network, because there are no recurrent connections which can provoke oscillations. Instead, they are caused by the loop through the environment (dashed line in Fig. 2B), which can be described as a reflex oscillator. The output of O1 is sent to the servo motor, and due to the motor’s inertia and friction the desired position is reached with a certain delay.
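A toy model can illustrate how a delay in the loop through the environment produces oscillations in a network that has no recurrent connections. The sketch below is not the micro.eve controller: it assumes a single tanh motor neuron that receives the sensed motor position through a negative weight, and it lumps the motor's inertia and friction into a first-order lag (`gain`) plus a fixed dead time (`delay`). The function name and all parameter values are hypothetical.

```python
import math


def simulate_reflex_loop(steps=400, delay=5, gain=0.3, w_feedback=-4.0, p0=0.1):
    """Simulate a motor neuron closed through a delayed position sensor.

    command(t) = tanh(w_feedback * position(t - delay))
    position(t+1) = position(t) + gain * (command(t) - position(t))

    Illustrative assumptions only; returns the list of motor positions.
    """
    positions = [p0]
    for t in range(steps):
        # The sensor reports the position from `delay` steps ago
        # (0.0 as long as no such measurement exists yet).
        sensed = positions[t - delay] if t - delay >= 0 else 0.0
        command = math.tanh(w_feedback * sensed)
        current = positions[t]
        # The motor approaches the commanded position with a first-order lag.
        positions.append(current + gain * (command - current))
    return positions


def count_sign_changes(values):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


delayed = simulate_reflex_loop(delay=5)
immediate = simulate_reflex_loop(delay=0)

print("sign changes with delay   :", count_sign_changes(delayed[-200:]))
print("final amplitude, no delay :", max(abs(p) for p in immediate[-50:]))
```

With these values the loop without dead time simply settles at rest, whereas the delayed loop keeps oscillating, although the neuron itself is identical in both runs.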
The current position of the motor is fed back to the network through I3, which has a strong negative connection to O1. Therefore, O1 produces signals inverse to the current motor position, and this causes the observed oscillations. These oscillations are essential for initiating a rotation of the ring at all [13]. As soon as the ring starts to rotate, the Hall sensor becomes active and due to the much

Fig. 3. A: Neural single-leg controller ($f(x) = \tanh(x)$) with input neurons I1 to I4 (labeled Joint1, Joint2, Joint3, Foot) and output neurons O1 to O3. B: Hysteresis of neuron O2, output of O2 plotted against its input for slow and normal walking conditions (see text for details).

stronger connection from this sensory input (I1) the aforementioned oscillations are suppressed depending on the signal strength of I1 (see Fig. 2C).

Summarizing these two examples, we can say that in both cases fast oscillations provide important behavior relevant dynamics. However, they differ in their origin. In the first case the oscillations result from the neurodynamics of the RNN. In the second experiment these oscillations emerge from the ongoing robot-environment interactions.

Neural hysteresis in reflex-walking-control. In this section the role of a hysteresis element in a simple neural reflex-oscillator for single-leg (3DOF) control of walking machines is demonstrated. The controller shown in Fig. 3A is one of the simplest and yet one of the most effective controllers found during evolution experiments for the task of forward walking [14]. How does this structure, using only one sensory neuron (neuron I1, encoding the angular position of joint 1), three motor neurons (neurons O1, O2, and O3, specifying the desired angles to the servo motors), and four synapses, produce a coordinated walking pattern of a 3DOF leg? All neurons used for control are connected in a loop (I1 - O2 - O3 - O1 - I1) which passes through the environment from neuron O1 to neuron I1 (dashed line). This sensorimotor loop results in a nonlinear transformation which can be approximated as a negative feedback with a time delay, resulting in a slow oscillatory movement (compare the aforementioned description of a reflex-oscillator for the micro.eve robot). During the oscillatory movement it was found that the motor neurons approximately act as bistable elements. The bistability can be explained by the property of neuron O2. Neuron O2 plays a major role in the controller network. It is the first neuron in a chain which directly couples all motor neurons.
The following motor neurons therefore have either the same phase or a phase shifted by 180 degrees (neuron O3 in phase, neuron O1 in antiphase) when compared to neuron O2. Neuron O2 has a self-connection larger than 1.0 which makes it a hysteresis element [11]. In Fig. 3B the output of neuron O2 is plotted against its input under actual walking conditions (outer curve). The plot shows two effects of the hysteresis element: First, the bistability may be explained by two stable fixed points of the hysteresis domain (≈ {−1, 1}). Second, the hysteresis element may be approximated as a time delay which adds to that of the environmental

Fig. 4. A: An RNN ($f(x) = \tanh(x)$) realizing low-pass filtering at ≈ 300 Hz. B, C: Input signal (output of I1) at increasing frequency (from 100 Hz to 1 kHz, 48 kHz sampling rate) and the corresponding output signal (output of O1) over time [steps]. D: The hysteresis effects between input and output signals of O1 at 300 Hz, 600 Hz, and 1000 Hz.

loop, therefore contributing to the slow and smooth oscillating walking movement. Finally, it may be noted that the transient is modulated by the frequency of the input signal. Under extremely slow (theoretical) walking conditions (inner curve of the neuron O2 input/output plot) the transient approaches the hysteresis of the system, and therefore becomes much narrower than during actual walking conditions.

Neural processing of auditory signals. Inspired by evolved robot control, we deduced a neural structure that realizes a simple hysteresis element (called dynamical neural Schmitt trigger, Fig. 4A). The structure has three parameters that define the width and the shift of the hysteresis domain [15]. For applications it is usually assumed that input signals vary only slowly with respect to the network update. But how do the dynamical properties of the neural Schmitt trigger change when the input values change on arbitrary time scales? Fig. 4 shows an example where a time series of continuously increasing frequency is fed into an RNN. At a certain frequency the output remains in the lower saturation domain of the output neuron. Hence, one may argue that hysteresis elements behave as a low-pass filter [16]. We have successfully adapted such a structure to filter background noise of a walking machine and even to recognize low-frequency sounds (i.e., 200 Hz) to perform a sound tropism [16]. These applications demonstrate how a sensory driven dynamical system becomes sensitive to the frequency of the input signal. The effect of filtering high-frequency signals itself can be explained by the shift of the hysteresis domain and the transients of the system. The self-connection determines how fast (i.e., in how many time steps) the neuron activation ends up near the fixed point. For the isolated system we have stable fixed points in the lower and upper saturation domain (i.e., ≈ {−1, 1}).
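Both the hysteresis and its frequency dependence can be illustrated with a single tanh neuron with an excitatory self-connection. This is a minimal sketch under stated assumptions, not the three-parameter Schmitt trigger of Fig. 4A: the self-weight `W_SELF = 2.0`, the input amplitude, and the two input periods are hypothetical, the bias term is omitted, and time is measured in network update steps rather than in Hz.

```python
import math

W_SELF = 2.0  # hypothetical self-connection larger than 1 (bistable regime)


def step(a, x):
    """One update a(t+1) = x + W_SELF * tanh(a(t)) with external input x."""
    return x + W_SELF * math.tanh(a)


def sweep(inputs, a=-2.0, settle=200):
    """Hold every input value for `settle` updates and record the settled output."""
    outputs = []
    for x in inputs:
        for _ in range(settle):
            a = step(a, x)
        outputs.append(math.tanh(a))
    return outputs


def drive(period, amplitude=0.6, steps=4000, a=-2.0):
    """Drive the neuron with a sine input whose period is given in update steps."""
    outputs = []
    for t in range(steps):
        a = step(a, amplitude * math.sin(2.0 * math.pi * t / period))
        outputs.append(math.tanh(a))
    return outputs


# Quasi-static sweep of the input from -1 to 1 and back to -1.
up = [i / 10 for i in range(-10, 11)]
cycle = up + list(reversed(up))
settled = sweep(cycle)
first_zero = up.index(0.0)                # input 0.0 on the way up
second_zero = len(cycle) - 1 - first_zero  # input 0.0 on the way down
print("output at input 0 (rising) :", round(settled[first_zero], 3))
print("output at input 0 (falling):", round(settled[second_zero], 3))

# The same neuron driven by a slow and by a fast input of equal amplitude.
slow = drive(period=400)
fast = drive(period=8)
print("slow input, output range:", round(min(slow), 3), round(max(slow), 3))
print("fast input, output range:", round(min(fast), 3), round(max(fast), 3))
```

For identical input values the settled output depends on the sweep direction, which is the hysteresis. With the fast input the neuron never leaves the saturation domain in which it started, although the amplitude is the same as for the slow input.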
When the input signal is continuously changing, the fixed points vary only slightly and if the amplitude is large enough one observes the characteristic jumps at the end of the hysteresis domain. However, when a high frequency input signal is applied, because of the

Fig. 5. The recurrent neural network producing a motivational driven robot behavior and the resulting behavior in simulation. The diagram shows the level of energy (I6, black) and the activation of the frontal light sensor (I4, grey) over time [steps] during the interaction ($f(x) = \sigma(x) = 1/(1 + e^{-x})$).

slowness of transient dynamics these fixed points are never approached, and at a certain frequency orbits may stay near one or the other fixed point. Our presented system has a cut-off frequency of ≈ 300 Hz (compare Fig. 4C). Here, the upper saturation domain will never be reached as a consequence of these slow transients and the bias term. Thus, high-frequency oscillations are suppressed, and therefore the system acts as a low-pass filter.

Reflex-walking-control and the low-pass filter are both based on bistable elements. The specific control signals, however, are determined by the frequency of the input signals modulating the transients of these hysteresis elements. Both examples, therefore, indicate how one and the same element can act in different ways due to its modulation by the sensorimotor loop.

Rhythmic behavior switching. For the study of behavior switches provided by complex neural dynamics we evolved an RNN to develop a motivational driven robot behavior. We call a robot behavior motivational driven if the neural control is determined not only by current sensor states of external stimuli but also by an internal level of energy. As a first simple example of such a motivational driven behavior we again used the Khepera robot and extended a reactive light seeking module (by structure evolution) to a control structure which maintains a certain level of energy while the robot accomplishes an exploration behavior. A resulting network is shown in Fig. 5. As one can see, the already introduced input-output structure of the reactive light seeking module (see Fig. 1) is extended by one input neuron I6. This neuron indicates the current level of the simulated energy reservoir, which is defined as follows:

$$\mathrm{I6}(t+1) := \mathrm{I6}(t) + c_1 \cdot \mathrm{I4}(t) - c_2, \quad c_1, c_2 > 0.$$
The constant loss of energy can only be compensated by standing in front of a light source (i.e., by high activations of the frontal light sensor I4).
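The energy update can be written down directly. The sketch below is a caricature under stated assumptions: the robot and its controller are replaced by a two-state switch, the frontal light sensor I4 is idealized as 1 at the light source and 0 elsewhere, and the robot is assumed to leave the light at an energy level of 0.8 and to return at a hypothetical level of 0.2. The constants `c1` and `c2` are likewise illustrative.

```python
def simulate_energy(steps=2000, c1=0.01, c2=0.002,
                    leave_at=0.8, return_at=0.2, energy=0.5):
    """Iterate I6(t+1) = I6(t) + c1 * I4(t) - c2 with a two-state behavior switch.

    The switching rule stands in for the evolved controller and the
    environment; all numeric values are assumptions for illustration.
    Returns the energy levels and the behavior state (True = at the light).
    """
    at_light = True
    levels, states = [], []
    for _ in range(steps):
        i4 = 1.0 if at_light else 0.0    # idealized frontal light sensor
        energy = energy + c1 * i4 - c2   # energy reservoir update
        if at_light and energy >= leave_at:
            at_light = False             # start exploring
        elif not at_light and energy <= return_at:
            at_light = True              # back at a light source
        levels.append(energy)
        states.append(at_light)
    return levels, states


levels, states = simulate_energy()
switches = sum(1 for a, b in zip(states, states[1:]) if a != b)
print("behavior switches:", switches)
print("energy range     :", round(min(levels), 3), round(max(levels), 3))
```

Even in this reduced form the energy level only oscillates because the behavior changes the sensor input that in turn refills the reservoir.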

Fig. 6. An internal rhythm generator ($f(x) = \sigma(x) = 1/(1 + e^{-x})$), and how it can be influenced by sensory stimuli. Left: network diagram. Right: output of H5 and I1 over time [steps].

The resulting robot behavior in simulation is also shown in Fig. 5. One can see that the robot switches between exploring the environment and standing close to a light source. The diagram in Fig. 5 indicates that the behavior switches are determined by the level of energy. At a certain intensity of I6 (≈ 0.8) the robot leaves the light source. Furthermore, the output of I6 is characterized by slow oscillations. These slow oscillations, however, are determined by the properties of the energy reservoir (i.e., c1 and c2 in the equation above) and by the robot-environment interaction. Notice that this leads to a cyclic causality: on the one hand, I6 determines the behavior switches, and, on the other hand, I6 is determined by the robot-environment interaction. The slow oscillations emerge from the sensorimotor loop.

Synchronized rhythms. Fig. 6 (left) shows an implementation of a neural rhythm generator. It is based on a two-neuron loop, called SO(2)-network [17]. These networks with a special weight matrix generate quasi-periodic oscillations with a sine-shaped wave form. The period of these oscillations depends only on one parameter in the weight matrix. The coupling of two identical SO(2)-networks can realize stable oscillations with very large periods [18]. In [18], a concrete implementation of the rhythm generator is used to coordinate competing behaviors in groups of up to 150 robots. Each robot is equipped with its own internal rhythm, that is, each robot has a slightly different frequency, which is reminiscent of circadian rhythms found in animals [19]. This rhythm determines whether the robot searches for and collects food in the environment or returns to a home area where the collected energy is transferred to the common nest of the group. To maximize the energy level of the nest, a coordination of the single behaviors turned out to be of great advantage, because the interferences resulting from the interactions of up to 150 robots in a shared environment lead to tremendous mutual obstructions [18]. To achieve a coordinated foraging and homing behavior within the whole group, the single rhythms have to become synchronized. To this end, a robot needs the ability to communicate its internal state to other robots. One output neuron (O1 in Fig. 6, left) triggers a sound signal when it reaches a certain threshold. This neuron is coupled to the pattern generator in such a way that sound signaling occurs during the switch from zero to one of the output of neuron H5, which amplifies the sine-shaped oscillations of the rhythm generator (Fig. 6, right). This signal can be perceived by nearby robots through the sensory input neuron I1. In turn, this perception provokes a phase reset, as can be seen for H5 in Fig. 6 (right). This mechanism allows behavior synchronization within a large robot group through minimal local communication. The resulting synchronized collective behavior is a result of local robot-robot and robot-environment interactions.

The impact of slowly varying inner rhythms on behavior control has in fact already been demonstrated for robotic applications (see [18]). However, the last two experiments provide minimal examples of the emergence as well as the synchronization of slow oscillations within the sensorimotor loop, and for both cases the essential elements of the interplay between internal neural dynamics and external world can be clearly indicated.
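The core of such a rhythm generator can be sketched as a two-neuron loop whose weight matrix is a slightly scaled rotation matrix, so that a single angle sets the period. The sketch assumes a tanh formulation without bias terms, which differs from the logistic neurons of Fig. 6; the scaling factor `alpha`, the target period, and the function name are hypothetical. It models neither the coupling of two SO(2)-networks nor the sound-triggered phase reset.

```python
import math


def so2_oscillator(period_steps=20.0, alpha=1.1, steps=2000):
    """Two-neuron loop with weight matrix alpha * rotation(phi).

    phi = 2 * pi / period_steps is the single parameter that sets the period;
    alpha slightly above 1 keeps the oscillation from dying out.
    Illustrative assumptions only; returns the output of the first neuron.
    """
    phi = 2.0 * math.pi / period_steps
    w11 = w22 = alpha * math.cos(phi)
    w12 = alpha * math.sin(phi)
    w21 = -alpha * math.sin(phi)
    a1, a2 = 0.1, 0.0
    outputs = []
    for _ in range(steps):
        o1, o2 = math.tanh(a1), math.tanh(a2)
        outputs.append(o1)
        # Both activations are updated synchronously from the old outputs.
        a1, a2 = w11 * o1 + w12 * o2, w21 * o1 + w22 * o2
    return outputs


outputs = so2_oscillator()
tail = outputs[-1000:]
# Count upward zero crossings to estimate the period of the oscillation.
upward = sum(1 for a, b in zip(tail, tail[1:]) if a < 0.0 <= b)
print("estimated period [steps]:", 1000 / upward)
print("output amplitude        :", round(max(tail), 3))
```

Changing `period_steps` changes only the rotation angle in the weight matrix, which corresponds to the statement that the period depends on a single parameter.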

3 Conclusions

In this paper we have presented six examples where the evolution of minimal recurrent neural networks for embodied agents explores the dynamics of robot-environment interactions. We have seen how oscillations can be integrated by the body of a robot or even induced by the sensorimotor loop through the environment. Furthermore, in neural structures with equivalent dynamical properties, transient effects resulting from robot-environment interactions are used for completely different tasks, such as locomotion in walking machines and the filtering of auditory signals. Finally, through interactions with the environment, internal rhythms determining differing behavior patterns can emerge in individuals or even become synchronized within large robot groups. Only by thoroughly analyzing evolved RNNs in the context of robot-environment interactions was it possible to reveal the interrelation between internal and external mechanisms underlying the evolved robot behavior.

The dynamical systems approach to adaptive behavior is still at its beginning in the context of ER experiments (e.g., [20, 2]). Only very few studies involve thorough analyses of the evolved neural mechanisms (e.g., [20]), which can help to better understand the dynamical mechanisms underlying complex behavior and to clarify which behavioral aspects can be attributed to internal dynamics and which to properties emerging from the sensorimotor loop. Our approach, however, does more than advance our understanding of these issues. It also enables us to construct highly efficient neural control systems by considering the sensorimotor loop to minimize the complexity required at the neurodynamics level. Our examples demonstrate that the evolution of minimal recurrent neural robot control enforces the development of simple networks (concerning their size, not their dynamics). This makes it possible to extract and set up basic neural structures together with their functions in a respective sensorimotor loop. Provided with such building blocks, one should then be able to gradually develop more and more elaborate behavior control for autonomous robots with richer sensorimotor equipment.

References

1. Pfeifer, R., Scheier, C.: Understanding Intelligence. MIT Press, Cambridge (2000)
2. Harvey, I., Di Paolo, E., Wood, R., Quinn, M., Tuci, E.: Evolutionary robotics: a new scientific tool for studying cognition. Artificial Life 11 (2005) 79–98
3. Nolfi, S., Floreano, D.: Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines. MIT Press, Cambridge (2000)
4. Clark, A.: Being there: putting brain, body and world together again. MIT Press (1997)
5. Nolfi, S.: Using emergent modularity to develop control systems for mobile robots. Adaptive Behavior 5 (1997) 343–363
6. Ziemke, T.: On ‘parts’ and ‘wholes’ of adaptive behavior: Functional modularity and diachronic structure in recurrent neural robot controllers. Meyer, J.A. et al. (Eds) Proc. of the 6th Int. Conf. on Simulation of Adaptive Behavior (2000) 115–124
7. Suzuki, M., Floreano, D., Di Paolo, E.: The contribution of active body movement to visual development in evolutionary robots. Neural Networks 18 (2005) 656–665
8. Pfeifer, R., Gomez, G.: Interacting with the real world: design principles for intelligent systems. Artificial Life and Robotics 9 (2005) 1–6
9. Pasemann, F.: Complex dynamics and the structure of small neural networks. Network: Computation in Neural Systems 13 (2002) 195–216
10. Hülse, M., Wischmann, S., Pasemann, F.: Structure and function of evolved neurocontrollers for autonomous robots. Connection Science 16 (2004) 249–266
11. Pasemann, F.: Dynamics of a single model neuron. International Journal of Bifurcation and Chaos 3 (1993) 271–278
12. Morris, L., Hooper, S.: Muscle response to changing neuronal input in the lobster (Panulirus interruptus) stomatogastric system: Slow muscle properties can transform rhythmic input into tonic output. The Journal of Neuroscience 18 (1998) 3433–3442
13. Wischmann, S., Hülse, M., Pasemann, F.: (Co)Evolution of (de)centralized neural control for a gravitationally driven machine. Capcarrere, M. et al. (Eds) Proc. of the 8th European Conf. on Artificial Life. LNAI 3630 (2005) 179–188
14. Twickel, A., Pasemann, F.: Evolved neural reflex-oscillators for walking machines. Mira, J., Alvarez, J.R. (Eds) Proc. of the Work-Conf. on the Interplay between Natural and Artificial Computation. LNCS 3561 (2005) 376–385
15. Hülse, M., Pasemann, F.: Dynamical neural Schmitt trigger for robot control. Dorronsoro, J.R. (Ed.): ICANN 2002. LNCS 2415 (2002) 783–788
16. Manoonpong, P., Pasemann, F., Fischer, J., Roth, H.: Neural processing of auditory signals and modular neural control for sound tropism of walking machines. Int. Journal of Advanced Robotic Systems 2 (2005) 223–234
17. Pasemann, F., Hild, M., Zahedi, K.: SO(2)-networks as neural oscillators. Mira, J., Alvarez, J.R. (Eds) Computational Methods in Neural Modeling, Proc.: IWANN 2003. LNCS 2686 (2003) 144–151
18. Wischmann, S., Hülse, M., Knabe, J., Pasemann, F.: Synchronization of internal neural rhythms in multi-robotic systems. Adaptive Behavior 14 (2006) 117–127
19. Winfree, A.: The Geometry of Biological Time. Springer (1980)
20. Beer, R.: The dynamics of active categorical perception in an evolved model agent. Adaptive Behavior 11 (2003) 209–244