IROS 2009(Kim&Langari)


Target Tracking Control of a Mobile Robot Using a Brain Limbic System Based Control Strategy

Changwon Kim and Reza Langari

IROS 2009

Contents n Previous research on Brain Emotional Learning n Introduction to the BELBIC n Mobile robot model with BELBIC n BELBIC target tracking model n BELBIC target tracking model with fuzzy clustering method n Conclusions n Future works IROS 2009

Previous BEL Research
Cognitive science
- Mowrer (1960): Two-process model of learning
- Rolls (1986): The mechanism of emotion and its application to the neural basis of emotion
- LeDoux (1995): Function of the amygdala in emotional processes
- Balkenius & Morén (2001): Development of a computational model of brain emotional learning

Previous BELBIC Research
Engineering
- Lucas et al. (2004): Introduced the BEL-based controller to engineering
- Mehrabian and Lucas (2005): Designed a robust adaptive controller via BELBIC
- Chandra and Langari (2006): Analyzed the BEL-based approach using methods of nonlinear system theory
- Shahmirzadi et al. (2006): Compared BEL with sliding mode control
- Lucas et al. (2006): Applied BELBIC to a washing machine
- Sheikholeslami et al. (2006): Applied BELBIC to an HVAC system
- Mehrabian et al. (2006): Applied BELBIC to an aerospace launch vehicle
- Rouhani et al. (2007): Applied BELBIC to a micro-heat exchanger
- Jafarzadeh et al. (2008): Applied BELBIC to path tracking


Brain Limbic System
Amygdala
- communicates with the other cortices in the limbic system
- associates a stimulus with its emotional consequence
- assigns a primary emotional value to each stimulus

Orbitofrontal cortex
- operates on the difference between the perceived reward and the actual received reward

Thalamus
- initiates the response to stimuli and sends signals to the amygdala and the sensory cortex

Sensory cortex
- distributes the incoming signals appropriately


[Figure: Cerebral Cortex. Source: http://www.morphonix.com/]

BELBIC (Brain Emotional Learning Based Intelligent Controller)

Learning rules

$$\Delta G_{A_i} = \alpha \cdot SI_i \cdot \max\Bigl(0,\; Rew - \sum_i A_i\Bigr)$$

$$\Delta G_{OC_i} = \beta \cdot SI_i \cdot \Bigl(\sum_i A_i - \sum_i OC_i - Rew\Bigr)$$

Internal signals / Model output

$$A_i = G_{A_i} \cdot SI_i, \qquad OC_i = G_{OC_i} \cdot SI_i$$

$$MO = \sum_i A_i - \sum_i OC_i$$

[Block diagram labels: Sensed Information, Thalamus, Sensory Cortex, SI, Rew, Amygdala (+), OFC (−), MO]
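The learning rules above can be sketched in a few lines of Python. This is a minimal discrete-time illustration, not the authors' implementation: the class name `Belbic`, its method names, and the choice to apply the increments once per call are assumptions made for the example. Only the formulas for A_i, OC_i, MO, ΔG_A and ΔG_OC come from the slide.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Belbic:
    """Hypothetical container: one amygdala/OFC gain pair per sensory input."""
    alpha: float          # amygdala learning rate
    beta: float           # orbitofrontal cortex (OFC) learning rate
    g_a: List[float]      # amygdala gains G_A,i
    g_oc: List[float]     # OFC gains G_OC,i

    def signals(self, si: List[float]) -> Tuple[float, float]:
        """Return (sum of A_i, sum of OC_i) with A_i = G_A,i*SI_i, OC_i = G_OC,i*SI_i."""
        a_sum = sum(g * s for g, s in zip(self.g_a, si))
        oc_sum = sum(g * s for g, s in zip(self.g_oc, si))
        return a_sum, oc_sum

    def output(self, si: List[float]) -> float:
        """Model output MO = sum(A_i) - sum(OC_i)."""
        a_sum, oc_sum = self.signals(si)
        return a_sum - oc_sum

    def learn(self, si: List[float], rew: float) -> None:
        """Apply one increment of both learning rules."""
        a_sum, oc_sum = self.signals(si)  # evaluated before any gain changes
        for i, s in enumerate(si):
            # The max(0, .) term means the amygdala gains never decrease
            # when SI_i is non-negative.
            self.g_a[i] += self.alpha * s * max(0.0, rew - a_sum)
            # The OFC gains can move in either direction.
            self.g_oc[i] += self.beta * s * (a_sum - oc_sum - rew)


ctrl = Belbic(alpha=0.5, beta=0.25, g_a=[0.0], g_oc=[0.0])
ctrl.learn([2.0], rew=1.0)
print(ctrl.g_a, ctrl.g_oc, ctrl.output([2.0]))
```

The asymmetry between the two updates is the point of the design: the amygdala only accumulates associations, while the OFC term can grow or shrink to correct the output when the reward falls short of what the amygdala predicts.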

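Mobile Robot Control Strategy

Mobile robot & control inputs

$$\dot{x} = v\cos\theta, \qquad \dot{y} = v\sin\theta, \qquad \dot{\theta} = \omega$$

$$v = \delta u_p, \qquad \omega = \varepsilon\varphi$$

[Figure: robot at (x, y) with velocity v in the X-Y plane, target at (x_t, y_t), offsets dx and dy]

SI and Reward

$$SI = (x_t - x)^2 + (y_t - y)^2$$

$$Rew = \gamma\, SI + \delta u_p$$

$$u_p = SI \times MO$$

Learning rules

$$\dot{G}_A = \alpha \max\{0,\; \gamma + (\delta\, SI - 1)\, G_A - \delta\, G_{OC}\, SI\}\; SI^2$$

$$\dot{G}_{OC} = \beta\, \{(1 - \delta\, SI)\, G_A + (\delta\, SI - 1)\, G_{OC} - \gamma\}\; SI^2$$

[Figure: Components of Rewards. Curves: Gam*SI and Del*Up; vertical axis: Reward; horizontal axis from 0 to 100]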
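The robot-specific learning rules on this slide follow from the general BELBIC rules by substitution. The short derivation below is a sketch under two assumptions that the slide does not state explicitly: there is a single sensory input (so the sums collapse to one term), and the increments ΔG are read as time derivatives. Factoring SI out of the max is valid because SI, a sum of squares, is non-negative.

$$
\begin{aligned}
A &= G_A\, SI, \qquad OC = G_{OC}\, SI, \qquad MO = (G_A - G_{OC})\, SI,\\
u_p &= SI \times MO = (G_A - G_{OC})\, SI^2, \qquad Rew = \gamma\, SI + \delta\, (G_A - G_{OC})\, SI^2,\\[4pt]
\dot{G}_A &= \alpha\, SI\, \max\{0,\; Rew - A\}
  = \alpha\, SI\, \max\{0,\; \gamma\, SI + \delta (G_A - G_{OC})\, SI^2 - G_A\, SI\}\\
 &= \alpha \max\{0,\; \gamma + (\delta\, SI - 1)\, G_A - \delta\, G_{OC}\, SI\}\; SI^2,\\[4pt]
\dot{G}_{OC} &= \beta\, SI\, (A - OC - Rew)
  = \beta\, SI\, \bigl(G_A\, SI - G_{OC}\, SI - \gamma\, SI - \delta (G_A - G_{OC})\, SI^2\bigr)\\
 &= \beta\, \{(1 - \delta\, SI)\, G_A + (\delta\, SI - 1)\, G_{OC} - \gamma\}\; SI^2.
\end{aligned}
$$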

BELBIC Target Tracking Model
Target Generator
- Multi-target problem
Error Analysis
- assigning a new target
- making SI
BELBIC
- robot velocity command from SI and Reward

[Block diagram labels: Target Generator, Distance / Angle error (+/−), SI, Rew, BELBIC, u, ×, v, Angular Vel, w, Robot, x, y]
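One pass through the pipeline on this slide (target generator, error analysis, BELBIC command, robot motion) might look like the sketch below. Several pieces are assumptions made for illustration and are not taken from the slides: the names `Pose`, `TargetGenerator`, and `track_step`; the switching tolerance `tol`; the definition of the angle error φ as the wrapped bearing-minus-heading difference; the forward-Euler integration with step `dt`; and the use of fixed gains (learning is left out to keep the example short). SI is computed as the sum of squared offsets, which is how the slide's formula reads after extraction; if a Euclidean distance is intended, `math.hypot` could be swapped in.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Target = Tuple[float, float]


@dataclass
class Pose:
    x: float
    y: float
    theta: float  # heading [rad]


def sensory_input(pose: Pose, target: Target) -> float:
    """SI as the sum of squared position offsets (assumed reading of the slide)."""
    xt, yt = target
    return (xt - pose.x) ** 2 + (yt - pose.y) ** 2


def angle_error(pose: Pose, target: Target) -> float:
    """Assumed phi: bearing to the target minus heading, wrapped to [-pi, pi]."""
    xt, yt = target
    d = math.atan2(yt - pose.y, xt - pose.x) - pose.theta
    return math.atan2(math.sin(d), math.cos(d))


class TargetGenerator:
    """Hypothetical multi-target generator: advance once SI drops below tol."""

    def __init__(self, targets: List[Target], tol: float = 0.05):
        self.targets = list(targets)
        self.tol = tol
        self.index = 0

    def current(self, pose: Pose) -> Target:
        while (self.index < len(self.targets) - 1
               and sensory_input(pose, self.targets[self.index]) < self.tol):
            self.index += 1
        return self.targets[self.index]


def track_step(pose: Pose, target: Target, g_a: float, g_oc: float,
               delta: float, eps: float, dt: float) -> Pose:
    """One Euler step of the unicycle model driven by the BELBIC command."""
    si = sensory_input(pose, target)
    mo = (g_a - g_oc) * si                 # MO = A - OC for a single input
    u_p = si * mo                          # u_p = SI x MO
    v = delta * u_p                        # linear velocity command
    w = eps * angle_error(pose, target)    # angular velocity command
    return Pose(
        pose.x + dt * v * math.cos(pose.theta),
        pose.y + dt * v * math.sin(pose.theta),
        pose.theta + dt * w,
    )


gen = TargetGenerator([(1.0, 0.0), (1.0, 1.0)])
pose = Pose(0.0, 0.0, 0.0)
pose = track_step(pose, gen.current(pose), g_a=0.5, g_oc=0.0,
                  delta=1.0, eps=1.0, dt=0.1)
print(pose)
```

In a full loop the gains would be updated by the learning rules at every step, and the generator would be queried before each call so that a reached target is replaced by the next one.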

BELBIC Target Tracking Model

[Figure panels:
- Robot Trajectory x and y: Position [m] vs. Time [sec]
- Robot Direction: Angle [Deg] vs. Time [sec]
- GA and Goc: Gain (×10^-3) vs. Time [sec]
- Robot Trajectories (Multi Targets), x vs. y: Y position [m] vs. X position [m], points A, B, C, D]

BELBIC Target Tracking Model (fuzzy clustering)
- larger error needs larger robot velocity
- a is decelerated faster than b: degree of Cd1
- d is accelerated faster than c: degree of Cd2

[Figure: Desired robot velocity v vs. error e, with velocity levels v_L, v_H, v_max, points a, b, c, d at errors e_a, e_b, e_c, e_d, and maximum error e_max]

SI and Reward

$$SI = (x_t - x)^2 + (y_t - y)^2$$

$$Rew = \mu_1 C_{d1} + \mu_2 C_{d2} + \delta u_p$$

$$u_p = SI \times MO$$

Learning rules

$$\dot{G}_A = \alpha \max\{0,\; \mu_1 C_{d1} + \mu_2 C_{d2} + \delta u_p - G_A\, SI\}\; SI$$

$$\dot{G}_{OC} = \beta\, \{G_A\, SI - G_{OC}\, SI - \mu_1 C_{d1} - \mu_2 C_{d2} - \delta u_p\}\; SI$$
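As a rough illustration of how the clustering degrees enter the controller, the sketch below evaluates the reward Rew = μ1·Cd1 + μ2·Cd2 + δ·u_p and the two gain rates for a single sensory input. The function names and the parameter values in the usage lines are hypothetical; the slides give the formulas but no numerical settings.

```python
from typing import Tuple


def clustered_reward(cd1: float, cd2: float, u_p: float,
                     mu1: float, mu2: float, delta: float) -> float:
    """Rew = mu1*Cd1 + mu2*Cd2 + delta*u_p."""
    return mu1 * cd1 + mu2 * cd2 + delta * u_p


def gain_rates(g_a: float, g_oc: float, si: float, cd1: float, cd2: float,
               alpha: float, beta: float,
               mu1: float, mu2: float, delta: float) -> Tuple[float, float]:
    """Return (dG_A/dt, dG_OC/dt) for one sensory input SI."""
    mo = (g_a - g_oc) * si          # MO = A - OC
    u_p = si * mo                   # u_p = SI x MO
    rew = clustered_reward(cd1, cd2, u_p, mu1, mu2, delta)
    dg_a = alpha * max(0.0, rew - g_a * si) * si
    dg_oc = beta * (g_a * si - g_oc * si - rew) * si
    return dg_a, dg_oc


# Hypothetical values, chosen only to exercise the formulas.
dg_a, dg_oc = gain_rates(g_a=0.0, g_oc=0.0, si=2.0, cd1=0.5, cd2=0.0,
                         alpha=0.1, beta=0.2, mu1=1.0, mu2=1.0, delta=0.0)
print(dg_a, dg_oc)
```

Compared with the earlier reward γ·SI + δ·u_p, the only change is that the error term is replaced by the two clustering degrees, so the same learning machinery can be reused with a different reward signal.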

BELBIC Target Tracking Model (fuzzy clustering)

Fuzzy clustering rules
- If error is 0 and velocity is 0, then Cd1 is 0 and Cd2 is 0.
- If error is 0 and velocity is 0.1, then Cd1 is 0.1 and Cd2 is 0.
- If error is 0 and velocity is 0.2, then Cd1 is 0.2 and Cd2 is 0.
- …
- If error is 1 and velocity is 0.9, then Cd1 is 0 and Cd2 is 1.
- If error is 1 and velocity is 1, then Cd1 is 0 and Cd2 is 0.
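A rule base of this kind could be evaluated as in the sketch below. It encodes only the five rules that the slide states explicitly; the rules hidden behind the ellipsis are not reconstructed. Everything about the inference itself is an assumption for illustration: triangular memberships of half-width 0.1 around each rule's error and velocity values, `min` as the AND operator, and weighted-average defuzzification. The slides do not say which membership shapes or inference method the authors used.

```python
from typing import List, Tuple

# (error, velocity, Cd1, Cd2): only the rules listed explicitly on the slide.
RULES: List[Tuple[float, float, float, float]] = [
    (0.0, 0.0, 0.0, 0.0),
    (0.0, 0.1, 0.1, 0.0),
    (0.0, 0.2, 0.2, 0.0),
    (1.0, 0.9, 0.0, 1.0),
    (1.0, 1.0, 0.0, 0.0),
]


def tri(x: float, center: float, width: float = 0.1) -> float:
    """Assumed triangular membership: 1 at center, 0 beyond center +/- width."""
    return max(0.0, 1.0 - abs(x - center) / width)


def cluster_degrees(error: float, velocity: float,
                    rules: List[Tuple[float, float, float, float]] = RULES
                    ) -> Tuple[float, float]:
    """Return (Cd1, Cd2) as the firing-strength-weighted average of rule outputs."""
    total = cd1 = cd2 = 0.0
    for e_c, v_c, r_cd1, r_cd2 in rules:
        w = min(tri(error, e_c), tri(velocity, v_c))  # AND of both antecedents
        total += w
        cd1 += w * r_cd1
        cd2 += w * r_cd2
    if total == 0.0:
        # No listed rule fires; the elided rules would cover this region.
        return 0.0, 0.0
    return cd1 / total, cd2 / total


print(cluster_degrees(0.0, 0.15))
print(cluster_degrees(1.0, 0.9))
```

With the complete rule table in place of `RULES`, the same function would interpolate between neighboring rules over the whole normalized error-velocity square.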

[Figure: membership functions for Error, Velocity, Cd1, and Cd2, each plotted over the range 0 to 1]

BELBIC Target Tracking Model (fuzzy clustering)
Target Generator
- Multi-target problem
Error Analysis
- assigning a new target
- making SI
Fuzzy Clustering
- clustering according to the error and velocity
BELBIC
- robot velocity command from SI and Reward

[Block diagram labels: Target Generator, Distance / Angle error (+/−), SI, Fuzzy Clustering, Rew, BELBIC, u, ×, v, Angular Vel, w, Robot, x, y]

BELBIC Target Tracking Model (fuzzy clustering)

[Figure panels:
- Robot Trajectory, target at x=10 and y=12: Position [m] vs. Time [sec]; curves: BELBIC with Clustering x, BELBIC with Clustering y, BELBIC only x, BELBIC only y
- Robot Direction: Angle [Deg] vs. Time [sec]
- GA and Goc: Gain (×10^-3) vs. Time [sec]; curves: BELBIC with Clustering GA, BELBIC with Clustering GOC, BELBIC only GA, BELBIC only GOC
- Robot Trajectories (Multi Targets), BELBIC with/without Clustering: curves BELBIC with Clustering and BELBIC only; points A, B, C, D]

Conclusion
- The BELBIC mobile robot tracks the target successfully
- BELBIC with the fuzzy clustering method works for target tracking
- BELBIC is a temporal learning method (at each time step the robot learns appropriate gains)
- A higher-level learning method needs to be developed to achieve an autonomous mobile robot

Future Works
High Level
- Long-term memory
- Planning
- Mapping
Low Level
- Target tracking
- Obstacle avoidance
Multi-objective Decision Making (AHP)

[Diagram labels: OFC, High Level; Amygdala, Low Level]

Development of a mobile robot navigation structure for Open World Model: Multi-objective Decision Making Method: Analytical Hierarchy Process (AHP)

Any questions or comments will be appreciated.

Thank you!
