Target Tracking Control of a Mobile Robot Using a Brain Limbic System Based Control Strategy
Changwon Kim and Reza Langari
IROS 2009
Contents
§ Previous research on Brain Emotional Learning
§ Introduction to the BELBIC
§ Mobile robot model with BELBIC
§ BELBIC target tracking model
§ BELBIC target tracking model with fuzzy clustering method
§ Conclusions
§ Future works
Previous BEL Research
§ Cognitive Science
  - Mowrer (1960): Two-process model of learning
  - Rolls (1986): The mechanism of emotion and its application to the neural basis of emotion
  - LeDoux (1995): Function of the amygdala in emotional processing
  - Balkenius & Moren (2001): Development of a computational model of brain emotional learning
Previous BELBIC Research
§ Engineering
  - Lucas et al. (2004): Introduced the BEL based controller to engineering
  - Mehrabian and Lucas (2005): Designed a robust adaptive controller via BELBIC
  - Chandra and Langari (2006): Analyzed the BEL based approach using methods of nonlinear system theory
  - Shahmirzadi et al. (2006): Compared BEL with sliding mode control
  - Lucas et al. (2006): Applied BELBIC to a washing machine
  - Sheikholeslami et al. (2006): Applied BELBIC to an HVAC system
  - Mehrabian et al. (2006): Applied BELBIC to an aerospace launching machine
  - Rouhani et al. (2007): Applied BELBIC to a micro-heat exchanger
  - Jafarzadeh et al. (2008): Applied BELBIC to path tracking
Brain Limbic System
§ Amygdala
  - communicates with the other cortices in the limbic system
  - forms the association between a stimulus and its emotional consequence
  - assigns a primary emotional value to each stimulus
§ Orbitofrontal cortex
  - operates on the difference between the perceived reward and the actual received reward
§ Thalamus
  - initiates the response to stimuli and sends signals to the amygdala and the sensory cortex
§ Sensory cortex
  - distributes the incoming signals appropriately
Figure: Cerebral Cortex <http://www.morphonix.com/>
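BELBIC (Brain Emotional Learning Based Intelligent Controller)
Figure: BELBIC block diagram (blocks: Thalamus, Sensory Cortex, Amygdala, OFC; signals: Sensed Information, SI, Rew, MO; the Amygdala and OFC outputs are combined with + and - signs to form MO).
Internal signals / Model Output
$A_i = G_{A_i} \cdot SI_i$
$OC_i = G_{OC_i} \cdot SI_i$
$MO = \sum_i A_i - \sum_i OC_i$
Learning rules
$\Delta G_{A_i} = \alpha \cdot SI_i \cdot \max\left(0,\; Rew - \sum_i A_i\right)$
$\Delta G_{OC_i} = \beta \cdot SI_i \cdot \left(\sum_i A_i - \sum_i OC_i - Rew\right)$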
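The internal signals and learning rules on this slide can be sketched as one discrete update step. This is a minimal illustration, not the authors' implementation: the function name `belbic_step`, the use of NumPy arrays for the gains, the choice to compute MO from the gains before they are updated, and the numeric values in the usage lines are all assumptions.

```python
import numpy as np

def belbic_step(g_a, g_oc, si, rew, alpha, beta):
    """One BELBIC update (hypothetical helper, assumed discrete-time form).

    g_a, g_oc : amygdala and orbitofrontal gain vectors
    si        : sensory input vector SI_i
    rew       : scalar reward signal Rew
    alpha, beta : learning rates (values are the caller's choice)
    """
    si = np.asarray(si, dtype=float)
    a = g_a * si                      # A_i  = G_Ai  * SI_i
    oc = g_oc * si                    # OC_i = G_OCi * SI_i
    mo = a.sum() - oc.sum()           # MO = sum(A_i) - sum(OC_i)

    # Amygdala gains can only grow: the max(0, .) term blocks decreases.
    d_ga = alpha * si * max(0.0, rew - a.sum())
    # Orbitofrontal gains move in either direction to correct the output.
    d_goc = beta * si * (a.sum() - oc.sum() - rew)

    return g_a + d_ga, g_oc + d_goc, mo

# Usage with illustrative (assumed) numbers.
g_a = np.zeros(2)
g_oc = np.zeros(2)
g_a, g_oc, mo = belbic_step(g_a, g_oc, si=[1.0, 2.0], rew=1.0, alpha=0.1, beta=0.05)
```

Because of the `max(0, .)` term, the amygdala gains never decrease for non-negative SI, so any reduction of the output has to come from the orbitofrontal gains.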
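Mobile Robot Control Strategy
Figure: mobile robot and target in the X-Y plane (labels: X, Y, (x, y), (x_t, y_t), v, dx, dy).
Mobile robot & control inputs
$\dot{x} = v \cos\theta$
$\dot{y} = v \sin\theta$
$\dot{\theta} = \omega$
$v = \delta u_p$, $\omega = \varepsilon \phi$
SI and Reward
$SI = (x_t - x)^2 + (y_t - y)^2$
$Rew = \gamma\, SI + \delta u_p$
$u_p = SI \times MO$
Learning rules
$\dot{G}_A = \alpha \max\left\{0,\; \gamma + (\delta\, SI - 1)\, G_A - \delta\, G_{OC}\, SI\right\} SI^2$
$\dot{G}_{OC} = \beta \left\{(1 - \delta\, SI)\, G_A + (\delta\, SI - 1)\, G_{OC} - \gamma\right\} SI^2$
Figure: Components of Rewards (curves Gam*SI and Del*Up; vertical axis Reward, 0 to 0.9; horizontal axis 0 to 100).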
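The closed-form learning rules on this slide follow from substituting $Rew = \gamma\, SI + \delta u_p$ and $u_p = SI \times MO$ into the general BELBIC rules for a single sensory input. The sketch below checks that equivalence numerically and adds a forward-Euler step of the robot kinematics. The function names, the Euler discretization, and all numeric values are assumptions made for illustration; the equivalence also relies on SI being non-negative.

```python
import math

def generic_rates(ga, goc, si, alpha, beta, gamma, delta):
    """General BELBIC rules for one sensory input, with the slide's reward."""
    a = ga * si                       # amygdala output
    oc = goc * si                     # orbitofrontal output
    mo = a - oc                       # model output
    u_p = si * mo                     # u_p = SI x MO
    rew = gamma * si + delta * u_p    # Rew = gamma*SI + delta*u_p
    d_ga = alpha * si * max(0.0, rew - a)
    d_goc = beta * si * (a - oc - rew)
    return d_ga, d_goc, u_p

def closed_form_rates(ga, goc, si, alpha, beta, gamma, delta):
    """Same rates written in the substituted form (valid for si >= 0)."""
    d_ga = alpha * max(0.0, gamma + (delta * si - 1.0) * ga - delta * goc * si) * si**2
    d_goc = beta * ((1.0 - delta * si) * ga + (delta * si - 1.0) * goc - gamma) * si**2
    return d_ga, d_goc

def unicycle_step(x, y, theta, v, omega, dt):
    """Forward-Euler step of x' = v cos(theta), y' = v sin(theta), theta' = omega."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

# Usage with assumed parameter values.
params = dict(alpha=0.01, beta=0.02, gamma=0.5, delta=0.2)
d_ga, d_goc, u_p = generic_rates(0.3, 0.1, 2.0, **params)
d_ga_cf, d_goc_cf = closed_form_rates(0.3, 0.1, 2.0, **params)
v = params["delta"] * u_p             # v = delta * u_p
pose = unicycle_step(0.0, 0.0, 0.0, v, 0.0, dt=0.1)
```

The angular command $\omega = \varepsilon \phi$ is left out of the sketch because the slide does not define $\phi$.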
BELBIC Target Tracking Model
§ Target Generator
  - multi-target problem
§ Error Analysis
  - assigning a new target
  - making SI
§ BELBIC
  - robot velocity command from SI and Reward
Figure: control block diagram (blocks: Target Generator, BELBIC, Robot; signals: Distance / Angle error, SI, Rew, u, v, w, Angular Vel, x, y).
BELBIC Target Tracking Model
Figure: Robot Trajectory x and y (Position [m] vs. Time [sec], 0 to 100 s; curves x and y).
Figure: Robot Direction (Angle [Deg] vs. Time [sec], 0 to 100 s).
Figure: GA and Goc (Gain, scale x 10^-3, vs. Time [sec], 0 to 100 s; curves GA and Goc).
Figure: Robot Trajectories (Multi Targets): x vs. y (Y position [m] vs. X position [m]; points labeled A, B, C, D).
BELBIC Target Tracking Model (fuzzy clustering)
§ A larger error needs a larger robot velocity
§ a decelerates faster than b: degree of Cd1
§ d accelerates faster than c: degree of Cd2
Figure: Desired robot velocity (v vs. error e; velocity levels v_L, v_H, v_max; points a, b, c, d at errors e_a, e_b, e_c, e_d; maximum error e_max).
SI and Reward
$SI = (x_t - x)^2 + (y_t - y)^2$
$Rew = \mu_1 C_{d1} + \mu_2 C_{d2} + \delta u_p$
$u_p = SI \times MO$
Learning rules
$\dot{G}_A = \alpha \max\left\{0,\; \mu_1 C_{d1} + \mu_2 C_{d2} + \delta u_p - G_A\, SI\right\} SI$
$\dot{G}_{OC} = \beta \left\{G_A\, SI - G_{OC}\, SI - \mu_1 C_{d1} - \mu_2 C_{d2} - \delta u_p\right\} SI$
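The reward and learning rules with the clustering terms can be sketched as follows for a single sensory input. The cluster degrees Cd1 and Cd2 are taken as given inputs here, since how they are computed belongs to the fuzzy clustering stage. The function name `clustering_rates` and the numeric values are assumptions for illustration only.

```python
def clustering_rates(ga, goc, si, cd1, cd2, alpha, beta, mu1, mu2, delta):
    """Gain rates with the fuzzy clustering reward (hypothetical helper).

    cd1, cd2 : cluster degrees supplied by the fuzzy clustering stage
    mu1, mu2 : weights on the cluster degrees
    """
    mo = ga * si - goc * si                       # MO = A - OC
    u_p = si * mo                                 # u_p = SI x MO
    rew = mu1 * cd1 + mu2 * cd2 + delta * u_p     # reward with clustering terms
    d_ga = alpha * max(0.0, rew - ga * si) * si
    d_goc = beta * (ga * si - goc * si - rew) * si
    return d_ga, d_goc, rew

# Usage with assumed values: untrained gains and a small Cd1 degree.
d_ga, d_goc, rew = clustering_rates(
    ga=0.0, goc=0.0, si=1.0, cd1=0.2, cd2=0.0,
    alpha=0.1, beta=0.1, mu1=1.0, mu2=1.0, delta=0.5,
)
```

Compared with the reward $\gamma\, SI + \delta u_p$, only the first term changes: the cluster degrees replace the term proportional to SI, so the update structure of the controller stays the same.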
BELBIC Target Tracking Model (fuzzy clustering)
Fuzzy clustering rules
  - If error is 0 and velocity is 0, then Cd1 is 0 and Cd2 is 0.
  - If error is 0 and velocity is 0.1, then Cd1 is 0.1 and Cd2 is 0.
  - If error is 0 and velocity is 0.2, then Cd1 is 0.2 and Cd2 is 0.
  - …
  - If error is 1 and velocity is 0.9, then Cd1 is 0 and Cd2 is 1.
  - If error is 1 and velocity is 1, then Cd1 is 0 and Cd2 is 0.
Figure: Membership Function: Error (membership 0 to 1 over the range 0 to 1).
Figure: Membership Function: Velocity (membership 0 to 1 over the range 0 to 1).
Figure: Membership Function: Cd1 (membership 0 to 1 over the range 0 to 1).
Figure: Membership Function: Cd2 (membership 0 to 1 over the range 0 to 1).
BELBIC Target Tracking Model (fuzzy clustering)
§ Target Generator
  - multi-target problem
§ Error Analysis
  - assigning a new target
  - making SI
§ Fuzzy Clustering
  - clustering according to the error and velocity
§ BELBIC
  - robot velocity command from SI and Reward
Figure: control block diagram with clustering (blocks: Target Generator, Fuzzy Clustering, BELBIC, Robot; signals: Distance / Angle error, SI, Rew, u, v, w, Angular Vel, x, y).
BELBIC Target Tracking Model (fuzzy clustering)
Figure: Robot Trajectory, target at x=10 and y=12 (Position [m] vs. Time [sec], 0 to 100 s; curves: BELBIC with Clustering x, BELBIC with Clustering y, BELBIC only x, BELBIC only y).
Figure: Robot Direction (Angle [Deg] vs. Time [sec], 0 to 100 s).
Figure: GA and Goc (Gain, scale x 10^-3, vs. Time [sec], 0 to 100 s; curves: BELBIC with Clustering GA, BELBIC with Clustering GOC, BELBIC only GA, BELBIC only GOC).
Figure: Robot Trajectories (Multi Targets): BELBIC with/without Clustering (points labeled A, B, C, D; solid line: BELBIC with Clustering, dashed line: BELBIC only).
Conclusion
§ The BELBIC mobile robot tracks the target successfully
§ BELBIC with the fuzzy clustering method works for target tracking
§ BELBIC is a temporal learning method (the robot learns appropriate gains each time)
§ A higher-level learning method needs to be developed to achieve an autonomous mobile robot
Future Works
§ High Level
  - Long-term memory
  - Planning
  - Mapping
§ Multi-objective Decision Making (AHP)
§ Low Level
  - Target tracking
  - Obstacle avoidance
Figure: navigation structure diagram (labels: OFC, High Level, Amygdala, Low Level).
Development of a mobile robot navigation structure for the Open World Model
Multi-objective decision making method: Analytic Hierarchy Process (AHP)