A User-Oriented Framework for the Design and Implementation of Pet Robots

Ming-Hsiang Su (1), Wei-Po Lee (2), Jian-Hsin Wang (1)

(1) Dept. of Management Information Systems, National Pingtung Univ. of Science & Technology, Pingtung, Taiwan
(2) Department of Information Management, National University of Kaohsiung, Kaohsiung, Taiwan

Abstract - In recent years, the application of intelligent autonomous robots to home amusement has become an important research topic, and pet robots have been designed to become the electronic toys of the next generation. To develop pet robots that can act in real time in the real world, this work adopts the behavior-based control architecture. In our control framework, an imitation-based learning system is included to build robot behaviors. Moreover, an emotional model is embedded in the control architecture. By giving the pet robot an emotional model, it can explicitly express its internal conditions through its various external behaviors, as a real living creature does. To evaluate the proposed framework, we have developed an interactive environment and successfully used it to design a pet robot.

Keywords: pet robot, emotion-based control, behavior system, robot entertainment.

1 Introduction

In recent years, designing robots for home amusement has become an important application of intelligent autonomous robot research, and pet robots have been implemented as the electronic toys of the next generation [4]. Though building fully autonomous artificial creatures with human-like intelligence is a long-term goal that has not yet been achieved, it is now possible to create embodied prototypes of artificial living toys acting in the real world, given the current technology in computing and electronics and the knowledge in ethology, neuroscience, and cognition. To provide an environment for developing pet robots, this paper presents an interactive framework with which the user (the owner of the pet robot) can conveniently design his personal robot according to his preferences.

From the engineering point of view, developing behavior-based control systems that decompose the competence of an agent into multiple task-achieving control modules has become a serious alternative to the traditional approach in robot design. This approach has been used to construct many real robots acting in real time in the real world, and the design concept has been confirmed to fulfill the important principles of cognitive science [9].

Therefore, our work adopts the behavior-based architecture as the control framework of pet robots. Yet, to construct robots with life-like characteristics, we have to take biological knowledge into account and further derive design principles from biological experience [1]. To behave as a living creature and to exhibit a certain level of intelligence, a digital pet needs some ability to interact with the other entities within the environment in which it is situated. Emotions are the most important drives to achieve this [5][8][10]. In fact, the famous neurophysiologist Antonio Damasio has suggested that efficient decision-making largely depends on the mechanisms underlying emotions, and his research has shown that even in simple decision-making processes, emotions are vital in reaching appropriate results [3]. Experimental findings have also shown that emotions can be mediated by the interactions between neural systems involving the amygdala, the hippocampus, and the prefrontal cortices [3][6]. Our control architecture therefore includes an emotion system that takes advantage of such cognitive processes to solve the behavior selection problem within the behavior-based control system.

In this work, emotions are defined as a product of internal state, and they are communicated externally through changes in external behavior. By giving the pet robot an emotional model, the pet robot can explicitly express its internal conditions through its external behaviors, as a real living creature does. The owner can thus understand the needs and the status of his pet robot and then interact with it further.

Definitions of emotions differ in scope, depending on the specific research purposes. Different definitions for various aspects of emotions have been proposed, and our work takes definitions from the behavioral and functional points of view to develop an emotion model for pet control. The behavioral definition defines emotion in terms of behavior rather than of particular brain structures. This definition focuses on whether emotional behaviors can be seen rather than on the physical structure that mediates the behaviors; under this definition, emotions can be inferred from actions. The functional definition argues that an artificial entity (such as a robot) would need emotions to drive and manage the behaviors it can exhibit, so that harmonious and sequential behavior can be observed.

Functional definitions are similar to the behavioral ones in that they both stress observable actions rather than the underlying biological structures. They differ in that functional definitions do not require these actions to actually happen, but focus on the processes, including the methods and mechanisms, that manage the actions.

To evaluate the proposed framework, we have developed an interactive environment by which a user can easily choose a set of formulas to specify the type and the characteristics of his pet. Different sets of experiments have been conducted for learning new behaviors and for learning how to coordinate different behaviors under a given emotion model. The results and analysis show that an emotion-based personal pet robot can be designed and implemented successfully with our framework.

2 A Framework for Constructing Pet Robots

2.1 Control Architecture

The most important issue in constructing user-oriented pet robots is to develop the control architecture embedded in the robots. In this work, the control architecture of a robot mainly includes four parts: a perception sub-system that receives and processes stimuli from the environment the robot is situated in, a behavior sub-system that contains reactive behavior controllers, an emotion sub-system that contains mathematical models of the emotional states of the robot for behavior selection, and a body sub-system that describes the internal states of the robot. Figure 1 shows the overall control architecture of a pet robot.

As in the behavior-based control paradigm, the behavior system in our work takes the structure of parallel task-achieving computational modules to resolve the control problem. Depending on the user's requirements (e.g., the type and characteristics of his digital pet), different behaviors can be designed for the pet robot. The behavior system in Figure 1 represents the set of user-defined behaviors, in which each behavior controller is a feed-forward neural network.

In order to achieve more complex behaviors in a coherent manner, the behavior modules developed have to be well organized and integrated. Inspired by the ethological models originally proposed to interpret the behavioral motivations of animals (e.g., [2][12]), roboticists have developed two types of architectures: flat and hierarchical. The former arranges the overall control system into two levels; the latter, into multiple levels. In the flat arrangement, the system designer considers the target control task as the interaction of a set of low-level subtasks that are related to the target task and relatively simple to solve. All the subtasks are independent and have their own goals. Each of them is achieved by a separate behavior controller, which generates its own output according to the sensory input directly relevant to the subtask itself. The independent outputs from the separate controllers are then combined appropriately to generate the final output for the overall task. The other type of architecture organizes the overall control system in a hierarchical way. It considers the target control task to be achieved in a recursive manner: after a control module is assembled from a set of lower-level controllers, it can be used as a building block to construct other controllers.

Though the above two ways organize their control structures differently, both of them need to deal with the problem of coordination, that is, to specify a way to combine the various outputs from the involved controllers in a coherent manner. There are two ways to implement behavior arbitrators, namely command fusion and command switching. In the first way, the arbitrator is a certain function of the outputs from the involved behavior controllers (the weighted sum is the most popular choice); it allows all of the participating behavior controllers to contribute to the final outputs simultaneously. The second way, command switching, operates in a winner-take-all fashion; only one of the output commands from the involved behavior controllers is chosen to take over the control at any given moment, according to certain sensory stimuli. The mechanism supporting the arbitration of switching can be regarded as a multiplexer, and can be implemented by a simple reactive module whose output is used to trigger one of the outputs from the involved behavior controllers, according to the changes of external environment conditions and internal system states. As this work aims to provide an interactive and easy-to-use framework for the design and implementation of digital pets, the straightforward approach, the flat architecture, is used as the default control structure for a pet, and the method of command switching is used to select a behavior controller to take over the control in a specific situation.

To behave like a living creature and to exhibit a certain level of intelligence, a digital pet needs some ability to interact with the other entities within the environment in which it is situated, and emotions are the most important drives to achieve this. By equipping the pet robot with an emotional model, the robot can explicitly express its internal conditions through its external behaviors to communicate with its owner; the owner, in turn, can understand the needs and the current status of his pet and then interact with it properly. In fact, in the study of neurophysiology, Antonio Damasio has suggested that efficient decision-making largely depends on the mechanisms underlying emotions, and his research has shown that even in simple decision-making processes, emotions are vital in reaching appropriate results [3]. Therefore, based on these discoveries in neurophysiology, our control architecture includes an emotion system for behavior selection (i.e., working as the arbitration mechanism mentioned above); the detailed model is described in Section 2.2.
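To make the two arbitration schemes concrete, the following minimal Python sketch illustrates both command switching and command fusion. It is only an illustration of the idea; the assumed controller interface (each controller returns an activation level together with a motor command) is our own, not the paper's implementation.

# Illustrative sketch of the two arbitration schemes (not the authors' code).
# Assumed interface: controller(sensor_input, internal_state) -> (activation, command),
# where command is a list of motor values.

def switch_commands(controllers, sensor_input, internal_state):
    """Command switching: winner-take-all, only the most strongly
    activated controller drives the motors at this moment."""
    best_activation, best_command = float("-inf"), None
    for controller in controllers:
        activation, command = controller(sensor_input, internal_state)
        if activation > best_activation:
            best_activation, best_command = activation, command
    return best_command

def fuse_commands(controllers, sensor_input, internal_state):
    """Command fusion: every controller contributes; here a weighted sum
    with the activation used as the weight."""
    total, fused = 0.0, None
    for controller in controllers:
        activation, command = controller(sensor_input, internal_state)
        if fused is None:
            fused = [0.0] * len(command)
        for i, value in enumerate(command):
            fused[i] += activation * value
        total += activation
    return [value / total for value in fused] if total else fused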

In addition to the above two sub-systems, a set of internal variables (such as hungry, agile, and familiar) is defined to describe a pet's body state. These variables represent the basic requirements of the pet that have to be satisfied, and they must be regulated and maintained within certain ranges. In our framework, the range of each internal variable can be defined by the user, and the corresponding specification determines the innate characteristics of his pet. Here, an internal variable is described by a mathematical formula of the involved objects (which are also defined by the user), and it can be expressed as Si(t) = Si(t-1) + f(O). In this expression, Si(t) is a numerical value describing the body state Si at time t, and f is a function describing how the internal variable Si is affected by a set of objects O (defined by the user); f can be chosen from some default candidates (such as linear, quadratic, or exponential functions). As can be seen in Figure 1, when a pet robot performs a certain behavior to produce some actions, its body state will be changed, and this will further affect the emotions of the pet. The emotions are then used to make decisions in selecting an appropriate behavior in response to the new situation.
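As a minimal sketch of the body-state update Si(t) = Si(t-1) + f(O), the Python fragment below assumes f is assembled from one of the default function families named above; the function and variable names are illustrative, not taken from the paper's implementation.

import math

# Default function families the user could pick from when defining f(O)
# (illustrative names and parameterization).
FUNCTION_FAMILIES = {
    "linear":      lambda x, a=1.0: a * x,
    "quadratic":   lambda x, a=1.0: a * x * x,
    "exponential": lambda x, a=1.0: a * (math.exp(x) - 1.0),
}

def update_internal_variable(s_prev, objects, family="linear"):
    """S_i(t) = S_i(t-1) + f(O): apply the chosen function family to each
    (user-defined) object measurement affecting this variable and sum them."""
    f = FUNCTION_FAMILIES[family]
    return s_prev + sum(f(value) for value in objects.values())

# Example: "hungry" grows with the time elapsed since the last meal and
# drops when food is eaten (signs and magnitudes are purely illustrative).
hungry = 0.4
hungry = update_internal_variable(hungry, {"time_since_meal": 0.1, "food_eaten": -0.3})
print(round(hungry, 2))   # 0.2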

Figure 1. The overall control architecture of a pet robot.
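For readers who prefer code to diagrams, the sketch below shows one possible way to wire the four sub-systems of Figure 1 together in Python; every class and method name here is an illustrative assumption rather than the authors' actual implementation.

# A minimal, illustrative skeleton of the four-part control architecture.
# Per step: sense -> update body state -> update emotions -> select a behavior.

class PerceptionSubsystem:
    def sense(self, environment):
        return environment.read_sensors()      # dict of named stimuli (assumed API)

class BodySubsystem:
    def __init__(self, internal_variables):
        self.state = dict(internal_variables)  # e.g. {"hungry": 0.5, "agile": 0.8}

    def update(self, percepts):
        pass                                   # user-defined formulas (Section 2.1)

class EmotionSubsystem:
    def __init__(self):
        self.values = {"happy": 0.0, "angry": 0.0, "fear": 0.0,
                       "bored": 0.0, "shock": 0.0}

    def update(self, body_state, percepts):
        pass                                   # user-defined formulas (Section 2.2)

    def select(self, controllers):
        return controllers[0]                  # placeholder arbitration rule

class BehaviorSubsystem:
    def __init__(self, controllers):
        self.controllers = controllers         # user-chosen behavior controllers

class PetRobot:
    def __init__(self, perception, body, emotion, behavior):
        self.perception, self.body = perception, body
        self.emotion, self.behavior = emotion, behavior

    def step(self, environment):
        percepts = self.perception.sense(environment)
        self.body.update(percepts)
        self.emotion.update(self.body.state, percepts)
        controller = self.emotion.select(self.behavior.controllers)
        return controller(percepts)            # motor commands for this time step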

2.2 The Emotion Model

Emotions can be categorized into two kinds: basic emotions and higher cognitive emotions. Basic emotions (such as joy, anger, and fear) are thought to be universal and innate; they are encoded in the genome as a legacy of evolutionary history. Higher cognitive emotions (such as love, shame, and pride) are universal too, but they exhibit more cultural variation. Unlike basic emotions, which are rapidly built up and only last for a few seconds, higher cognitive emotions take longer to arise and to fade. Basic emotions tend to be reflex-like responses over which animals have little conscious control. They involve fewer cognitive processes and are thus much faster in controlling action than the culturally determined experiences (rules) residing in the cognitive system (long-term memory). As our goal here is to establish a framework for constructing personal pets, rather than to investigate the interactions and relationships between cognitive and emotion systems, our work only models simple emotions to coordinate the behavior controllers the user has pre-chosen for his pet. The simple emotions in this work include "happy", "angry", "fear", "bored", and "shock", which are the most common basic emotions agreed upon (though different names may be used) by different researchers.

To quantify the emotions mathematically, this work defines each of the above basic emotions as Ei(t) = Ei(t-1) + g1(S) + g2(P), in which Ei(t) represents the quantity of emotion Ei at time t, and g1 and g2 are two functions (also selected by the user) describing how the pet's body state S and perception information P affect a certain emotion. The set of five equations for the basic emotions then constitutes the emotion model of a pet robot. Like the internal variables described above, the emotion model determines a pet's innate characteristics, which are strongly related to the pet's abilities of learning and adaptation.

With the above model, at any given time the emotions of the pet can be derived and used to trigger a behavior controller to generate a specific action. After the action is performed, the pet's internal body state and the external environment conditions may change, which could further affect the emotions of the pet. The pet then makes a new decision for behavior selection, based on the modified quantities of the emotions. For example, a pet dog in the hungry state may be angry and would look for food and eat as soon as it finds any. After that, the dog may not be hungry any more (the body state is changed), so it becomes happy (the emotion is changed) and wants to sleep or fool around (a new behavior is selected). This procedure is repeated, and the emotion model continuously works as a decision-making mechanism for behavior selection.
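A minimal sketch of how the five emotion equations and the resulting behavior selection might be evaluated is given below; the choice of g1 and g2 as weighted sums and the "strongest emotion wins" selection rule are our own simplifying assumptions for illustration.

EMOTIONS = ["happy", "angry", "fear", "bored", "shock"]

def update_emotions(emotions, body_state, percepts, g1_weights, g2_weights):
    """E_i(t) = E_i(t-1) + g1(S) + g2(P); here g1 and g2 are taken to be
    weighted sums over body-state and perception variables."""
    updated = {}
    for name in EMOTIONS:
        g1 = sum(w * body_state.get(var, 0.0)
                 for var, w in g1_weights.get(name, {}).items())
        g2 = sum(w * percepts.get(var, 0.0)
                 for var, w in g2_weights.get(name, {}).items())
        updated[name] = emotions[name] + g1 + g2
    return updated

def select_behavior(emotions, behavior_for_emotion):
    """Illustrative arbitration: run the behavior tied to the strongest emotion."""
    return behavior_for_emotion[max(emotions, key=emotions.get)]

# Example: a very hungry pet becomes angry and switches to food-seeking.
emotions = dict.fromkeys(EMOTIONS, 0.0)
g1 = {"angry": {"hungry": 0.8}, "happy": {"hungry": -0.5}}
g2 = {"shock": {"loud_noise": 1.0}}
emotions = update_emotions(emotions, {"hungry": 0.9}, {"loud_noise": 0.0}, g1, g2)
print(select_behavior(emotions, {"happy": "sleeping", "angry": "food-seeking",
                                 "fear": "escaping", "bored": "wandering",
                                 "shock": "barking"}))   # food-seeking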

2.3 Learning Simple Behaviors by Imitation

As described above, the behavior system shown in Figure 1 contains a set of behavior controllers, and the user can choose some of them as the simple behaviors his digital pet can perform. Our framework includes a sub-system (shown in Figure 2) that allows a user to build behavior controllers for his personal pet via an imitation-based learning mechanism. This sub-system includes three modules. The first is a simulated environment in which the user can drive his pet robot to perform the goal behavior (i.e., to achieve the target task). In this stage, the robot is regarded as a teacher that shows how to perform the behavior. During the demonstration, at each time step the perception information received from the robot's sensors and the motor commands executed in response to that information are recorded as an instruction for achieving the task. All the instructions collected are then combined into a training data set for later learning. In other words, the goal is to derive the instruction set (i.e., the sensor-motor mapping table) from the behavior demonstrated by the teacher robot. In this approach, the task the robot is expected to achieve has to be transformed from a qualitative behavior into a set of quantitative data. Our work takes a sensor-motor mapping for a given time step t as an instruction vector <s1, s2, ..., sn, a1, a2, ..., ak>, in which si is a normalized perception stimulus for sensor i, ai is a normalized and transformed action value, and n and k are the numbers of sensors and actuators considered relevant to the target behavior, respectively.
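The sketch below shows one way the demonstration could be turned into instruction vectors <s1, ..., sn, a1, ..., ak>; the normalization ranges are assumptions chosen purely for illustration.

def normalize(value, low, high):
    """Map a raw reading into [0, 1] given an assumed sensor/actuator range."""
    return max(0.0, min(1.0, (value - low) / (high - low)))

def record_instruction(sensor_readings, motor_commands,
                       sensor_range=(0, 1023), motor_range=(-20, 20)):
    """One instruction vector <s1..sn, a1..ak> for a single time step."""
    s = [normalize(v, *sensor_range) for v in sensor_readings]
    a = [normalize(v, *motor_range) for v in motor_commands]
    return s + a

def record_demonstration(steps):
    """Build the training set from one demonstration trial; 'steps' is a list
    of (sensor_readings, motor_commands) pairs logged while the user drives
    the teacher robot."""
    return [record_instruction(sensors, motors) for sensors, motors in steps]

# Example: a single step with eight infrared readings and two motor speeds.
print(record_demonstration([([553, 340, 0, 0, 0, 0, 0, 3], [11, 10])])[0])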

The second module includes an imitation mechanism that learns the target behavior from the data collected in the above procedure. In this work, a neural network is adopted as the behavior controller because it has the characteristics of biological compatibility and noise resistance. The network taken as a behavior controller in this work is a two-layer feed-forward neural network. The first layer is the input layer, consisting of input units corresponding to the sensors mounted on the robot, and the second is the output layer connected to the actuators. With the above neural network architecture and the characteristics of the sensors and motors, we can then employ the back-propagation learning algorithm to learn a set of appropriate connection weights and neuron thresholds for robot control from the demonstration examples shown by the teacher robot. By adjusting the network connections to generate appropriate motor commands that reproduce the desired behavior, the robot with this controller is, in this sense, imitating the teacher's behavior. When a controller is evaluated, at each time step the sensor information <s1, s2, ..., sn> in the mapping table is sent to it, and the actual output of this controller (interpreted as the actuator command) is expected to be the same as the one correspondingly recorded in the sensor-motor map (i.e., <a1, a2, ..., ak>). Therefore, the error for each time step can be defined as the difference between the expected and the actual outputs. To cope with different environment situations, the teacher robot normally shows the target behavior a few times so that a reliable and robust behavior controller can be obtained. All instructions about the sensor-motor mapping from the different behavior trials are collected and arranged in a single training set. By minimizing the accumulated action error over the entire training set, the robot can improve its behavior and finally achieve the task.
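As a concrete but simplified illustration of the learning step, the sketch below fits an input-to-output feed-forward network to the recorded instruction vectors by gradient descent on the squared error (for a network without hidden units this is the degenerate case of back-propagation); the sigmoid outputs, learning rate, and epoch count are our assumptions.

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_controller(training_set, n_sensors, n_motors,
                     learning_rate=0.1, epochs=500, seed=0):
    """Learn weights of a two-layer (input -> output) network from instruction
    vectors <s1..sn, a1..ak> by minimizing the squared action error."""
    rng = random.Random(seed)
    # weights[j][i] connects sensor i to output unit j; the last entry is a bias.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_sensors + 1)]
               for _ in range(n_motors)]
    for _ in range(epochs):
        for pattern in training_set:
            s, a = pattern[:n_sensors], pattern[n_sensors:]
            x = s + [1.0]                                   # bias input
            for j in range(n_motors):
                out = sigmoid(sum(w * xi for w, xi in zip(weights[j], x)))
                delta = (a[j] - out) * out * (1.0 - out)    # output-layer gradient
                for i in range(n_sensors + 1):
                    weights[j][i] += learning_rate * delta * x[i]
    return weights

def run_controller(weights, sensors):
    """Map a (normalized) sensor vector to normalized motor commands."""
    x = list(sensors) + [1.0]
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]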

The third module involves a re-learning procedure. That is, if the learner robot cannot produce a behavior similar to the teacher robot's after learning, the user can modify the training set by driving the teacher robot to repeat the sub-behaviors that the learner robot failed to achieve and adding the newly obtained patterns to the set. The imitation procedure is then evoked again to re-learn a new controller.
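Using the hypothetical helpers sketched above, the re-learning step could be summarized as follows.

def relearn(training_set, corrective_steps, n_sensors, n_motors):
    """Append the patterns from a corrective demonstration and train again
    (record_demonstration and train_controller are the sketches above)."""
    training_set = training_set + record_demonstration(corrective_steps)
    return train_controller(training_set, n_sensors, n_motors), training_set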

3 Implementation and Results

To evaluate the proposed framework, we have conducted different sets of experiments. Due to space limitations, in the following sections we only describe the experiments for teaching a pet to perform separate simple behaviors and the experiment for training a pet to select the correct behavior expected by the owner.

3.1 Behavior Teaching

In the behavior teaching experiments, we used one robot to act as both the teacher and the learner. The first behavior controller to learn is one that can fulfill the task of obstacle avoidance. For this task, the robot was restricted to using its eight infrared sensors to detect objects around it and then decide how to move. A two-layer neural network (one layer for sensor input and one for motor output) was defined, and only the infrared sensor signals were used as the input. The simulated robot was driven manually to perform the obstacle avoidance task perfectly, and the information was recorded to train the learner robot. Figure 3 shows the environment developed in this work in which the user can operate the robot to demonstrate the target behavior, by which the instruction set can be obtained. With the quantitative instruction set, a supervised learning algorithm can thus be employed.
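In terms of the earlier sketches, this experiment would amount to instantiating the controller with eight infrared inputs and two motor outputs, for example:

# Hypothetical instantiation of the earlier sketches for obstacle avoidance:
# eight infrared sensors in, two motor speeds out.
demo_steps = []   # (sensor_readings, motor_commands) pairs logged during driving
training_set = record_demonstration(demo_steps)
avoidance_weights = train_controller(training_set, n_sensors=8, n_motors=2)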

Figure 2. The sub-system for learning robot behaviors.

Figure 3. The environment for operating the robot manually to collect the sensor-motor information for achieving the target task.

In order to obtain a reliable and robust controller, in our experiments the teacher robot demonstrated the behaviors several times, and each time the robot started from a different position so as to gather information distributed over different regions of the given environment. Figure 4(a) shows the typical trajectory produced. After the training phase, the learner robot was able to produce behavior very similar to the teacher's, as presented in Figure 4(b). This shows that the learner robot has successfully imitated the teacher's behavior.

To learn more complicated behaviors, we used the framework to develop a neural controller for approaching a light source while avoiding the obstacles in the environment. To accomplish this task, a successful controller has to negotiate signals from two kinds of sensors, that is, to solve the sensor fusion problem. Figure 5(a) shows the original target behavior presented by the teacher robot: the robot turned toward the light source and moved as straight as possible when it did not sense any obstacles. The overall control strategy is that whenever the robot detected obstacles around it, it turned away from them as it did in the experiments described above, and then kept moving toward the light source. Figure 5(b) shows the behavior produced by the learner robot after the imitation. From these figures, we can observe that the behaviors of the teacher and the learner are very similar. The two trajectories also illustrate that both controllers use the same strategies to avoid obstacles and approach the light source. This indicates that, through the imitation-based framework, the example trials provided by the teacher robot have been learnt successfully.

Figure 4. (a) The typical behavior demonstrated by the teacher robot; (b) The behavior shown by the learner robot after learning.

Figure 5. (a) The example behavior demonstrated by the teacher robot; (b) The behavior shown by the learner robot.

3.2 Emotion-Based Behavior Selection

Having shown that the individual behaviors for a pet can be constructed, we describe in this section how we used an emotion-based approach for behavior arbitration. In these experiments, the emotion variables described in the previous section are arranged as the input units of a neural network, and the output of this network is used to determine which behavior to perform at a certain time. Ten basic behaviors are built here, including food-seeking, barking, wandering, licking, sleeping, and escaping, among others. In the training phase, the user is allowed to give some training examples that specify which behavior the pet is expected to perform when the emotions reach the values he has assigned. Figure 6 shows the interface through which a user can edit the training set. Based on the emotion model defined by the user previously (i.e., by choosing appropriate functions as described in Section 2) and the training set he has provided, the system then learns a control strategy with the best approximation.
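A possible realization of this behavior-selection network is sketched below, reusing the hypothetical train_controller and run_controller from Section 2.3 and the EMOTIONS list from Section 2.2. Only six of the ten behaviors are named in the text, so the remaining four names are placeholders, and the one-hot training encoding is our assumption.

BEHAVIORS = ["food-seeking", "barking", "wandering", "licking", "sleeping",
             "escaping", "playing", "begging", "hiding", "digging"]
# The last four behavior names are placeholders; the paper names only six.

def make_example(emotion_values, expected_behavior):
    """One user-edited training example: emotion vector -> one-hot behavior target."""
    target = [1.0 if b == expected_behavior else 0.0 for b in BEHAVIORS]
    return [emotion_values[e] for e in EMOTIONS] + target

def choose_behavior(weights, emotion_values):
    """Pick the behavior whose output unit scores highest for these emotions."""
    scores = run_controller(weights, [emotion_values[e] for e in EMOTIONS])
    return BEHAVIORS[scores.index(max(scores))]

# Example: the owner specifies that an angry (hunger-driven) pet should seek food.
examples = [make_example({"happy": 0.1, "angry": 0.9, "fear": 0.0,
                          "bored": 0.0, "shock": 0.0}, "food-seeking")]
arbitration_weights = train_controller(examples, n_sensors=5, n_motors=10)
print(choose_behavior(arbitration_weights,
                      {"happy": 0.1, "angry": 0.9, "fear": 0.0,
                       "bored": 0.0, "shock": 0.0}))       # food-seeking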

Figure 6. The interface for editing training examples.

Figure 7 shows the environment presenting the typical experimental results. It shows the gesture and facial expression of the pet during a test trial. The trajectory of the pet is also shown, in which the circle indicates the visible range of the pet. In addition, the numerical values of the internal and emotion variables are displayed to provide information about the body state and emotions of the pet. With the results provided, the user can inspect the details related to his pet. If the pet's behavior does not satisfy the owner's expectation, the owner can correct the behavior by editing the output produced by the learned control strategy (through the environment shown in Figure 7). The modified output can then be used as new training examples to derive a new strategy of behavior arbitration.
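The correction cycle described above could then be expressed in terms of the same hypothetical helpers: each owner correction becomes a new training example and the arbitration strategy is re-learned.

def correct_arbitration(examples, emotion_values, corrected_behavior):
    """Fold the owner's correction back into the training set and re-learn
    (make_example and train_controller are the sketches given earlier)."""
    examples = examples + [make_example(emotion_values, corrected_behavior)]
    return train_controller(examples, n_sensors=5, n_motors=10), examples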

4 Conclusions

In this paper, we have indicated the importance of developing pet robots and described how we constructed a user-oriented framework for the design and implementation of personal pet robots.

In our prototype system, the control architecture of a robot mainly includes four parts: a perception sub-system that receives and processes stimuli from the environment the robot is situated in, a behavior sub-system that contains reactive behavior controllers, an emotion sub-system that contains mathematical models of the emotional states of the robot for behavior selection, and a body sub-system that describes the internal states of the robot. The functions of the different sub-systems have been described and their relationships analyzed. With such an interactive platform, a user can design his personal pet robot in an efficient manner. To verify this framework, we have conducted different sets of experiments to evaluate our approach, and the preliminary results have shown the promise and potential of our system.

Having shown that our system is convenient for developing pet robots, we are currently extending our work in several research directions. The first is to include more behaviors and then evaluate the performance of the emotion-based mechanism for behavior arbitration. Another is to improve the virtual world so that the pet can act in a three-dimensional environment. It is also worthwhile to investigate ways to bridge the gap between simulation and reality, and to develop embodied pet robots working in the real world.

Acknowledgement

This work was partially supported by the National Science Council under grant NSC91-2213-E-020-005.

References

[1] R. C. Arkin, M. Fujita, T. Takagi, and R. Hasegawa. An Ethological and Emotional Basis for Human-Robot Interaction. Robotics and Autonomous Systems, 42(3/4): 191-201, 2003.
[2] G. P. Baerends. A Model of the Functional Organization of Incubation Behavior. Behaviour Supplement, 17: 263-312, 1970.
[3] A. R. Damasio. Descartes' Error: Emotion, Reason, and the Human Brain. Grosset/Putnam Press, New York, 1994.
[4] M. Fujita. Digital Creatures for Future Entertainment Robotics. Proceedings of the IEEE International Conference on Robotics and Automation, pp. 801-806, 2000.
[5] D. Goleman. Emotional Intelligence. Bantam Books, New York, 1995.
[6] J. E. LeDoux. The Emotional Brain. Simon & Schuster, New York, 1996.
[7] M. Minsky. The Society of Mind. Simon & Schuster, New York, 1987.
[8] K. Oatley and P. N. Johnson-Laird. Towards a Cognitive Theory of Emotions. Cognition and Emotion, 1(1): 29-50, 1987.
[9] R. Pfeifer and C. Scheier. Understanding Intelligence. MIT Press, Cambridge, MA, 1999.
[10] P. Salovey and J. D. Mayer. Emotional Intelligence. Imagination, Cognition and Personality, 9(3): 185-211, 1990.
[11] A. Sloman. Motives, Mechanisms, and Emotions. Cognition and Emotion, 1(3): 217-224, 1987.
[12] N. Tinbergen. The Study of Instinct. Clarendon Press, Oxford, 1951.

Figure 7. The platform for initial settings and showing the typical results.