Fast and Loose in Bounded Suboptimal Heuristic Search

Jordan Thayer and Wheeler Ruml ({jtd7, ruml} at cs.unh.edu)
Ephrat Bitton (ebitton at berkeley.edu)

Jordan Thayer (UNH)

Fast Bounded Suboptimal Search – 1 / 35

Motivation

■ Finding optimal solutions is prohibitively expensive.
■ It's nice to limit suboptimality.
■ Weighted A∗ is a popular method for doing that.
■ This talk: two algorithms which are often better.

[Figure: Four-way Grid Pathfinding (Unit cost), two panels plotting nodes generated (relative to A*) and solution cost (relative to A*) against problem size (200 to 1,000) for A*, wA*, Clamped Adaptive, Optimistic, and Greedy.]

Talk Outline

■ Background: Weighted A*
■ Strict Approach: Clamped Adaptive
  Correct for underestimating h(n); bound the correction to ensure w-admissibility
■ Loose Approach: Optimistic Search
  Greedily search for a solution; enforce the suboptimality bound afterwards

Weighted A∗ (Pohl, 1970)

A∗ is a best-first search ordered on f(n) = g(n) + h(n).

Weighted A∗: f′(n) = g(n) + w · h(n)

What does w do?
■ breaks ties on f(n) in favor of high g(n)
■ corrects for underestimating h(n)
■ deepens search / emphasizes greed
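The ordering above can be sketched as a small best-first search. This is a generic illustration rather than the authors' implementation; the graph interface (`neighbors`, `h`) and the duplicate handling via `best_g` are assumptions of the sketch.

```python
import heapq

def weighted_astar(start, goal, neighbors, h, w=1.5):
    """Best-first search ordered on f'(n) = g(n) + w * h(n).

    `neighbors(n)` yields (child, edge_cost) pairs and `h` is an
    admissible heuristic.  With w >= 1, the returned cost is at
    most w times the optimal cost (see the bound proof on the
    next slide).
    """
    # Entries are (f', g, node, path); heapq pops the lowest f'.
    open_list = [(w * h(start), 0, start, [start])]
    best_g = {start: 0}
    while open_list:
        _, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper path to node was found
        for child, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(
                    open_list,
                    (g2 + w * h(child), g2, child, path + [child]))
    return None  # no solution
```

A larger w pulls the search toward low-h(n) nodes, which is the "emphasises greed" behavior described above.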

Weighted A∗ Respects a Bound

Let p be a node on open lying on an optimal path to the optimal solution opt, with f(n) = g(n) + h(n) and f′(n) = g(n) + w · h(n). Then:

g(sol) = f′(sol) ≤ f′(p) = g(p) + w · h(p) ≤ w · (g(p) + h(p)) = w · f(p) ≤ w · f(opt) = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).

Weighted A∗ is a Popular Choice

■ Weighted A*, Pohl (1970)
■ Dynamically Weighted A*, Pohl (1973)
■ Aε, Ghallab & Allard (1983)
■ A∗ε, Pearl (1984)
■ AlphA*, Reese & Frichs (unpublished)

[Figure: Eight-way Grid Pathfinding (Unit cost), plotting nodes generated (relative to A*) against the sub-optimality bound (1.2 to 1.8) for dwA*, A* eps, AlphA*, and wA*.]


Improving Weighted A∗

■ If h were perfect, solutions would be found in linear time.
■ How do we improve h(n)? By correcting for the error in h(n).
■ We'll ensure w-admissibility shortly.

Correcting h(n) with One-Step Error

Consider a single expansion of a parent p generating its best child bc. Recall that f(n) = g(n) + h(n).

■ f(n) should remain constant from parent to child; if f(n) = g(n) + h∗(n), this would be true.
■ g(n) is exact, so all the error in f(n) comes from h(n): errh = f(bc) − f(p)
■ Track a running average of errh and use it to correct the heuristic:
  ĥ(n) = h(n) · (1 + errh)
  f̂(n) = g(n) + ĥ(n)
■ ĥ(n) is inadmissible; clamping enforces w-admissibility.
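A minimal sketch of the bookkeeping described above, assuming the running-average error model; the names `ErrorModel` and `clamped_priority` are illustrative, not from the paper, and the clamp anticipates the f̃(n) = min(f̂(n), w · f(n)) rule proved admissible on the next slides.

```python
class ErrorModel:
    """Running average of the one-step heuristic error
    errh = f(best_child) - f(parent), observed at each expansion."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def observe(self, f_parent, f_best_child):
        self.total += f_best_child - f_parent
        self.count += 1

    @property
    def avg(self):
        return self.total / self.count if self.count else 0.0


def clamped_priority(g, h, avg_err, w):
    """Clamped Adaptive priority: f~(n) = min(f^(n), w * f(n)).

    h^(n) = h(n) * (1 + avg_err) inflates the admissible h by the
    observed one-step error; clamping at w * f(n) keeps the search
    w-admissible even though h^ may overestimate.
    """
    h_hat = h * (1.0 + avg_err)
    f_hat = g + h_hat
    return min(f_hat, w * (g + h))
```

When the corrected estimate stays under the clamp, the search orders on f̂; when it would violate the bound, the priority falls back to w · f(n), exactly the wA* priority.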

Admissibility of Clamping: Weighted A*

Let p be a node on open lying on an optimal path to opt, with f(n) = g(n) + h(n) and f′(n) = g(n) + w · h(n):

g(sol) = f′(sol) ≤ f′(p) = g(p) + w · h(p) ≤ w · (g(p) + h(p)) = w · f(p) ≤ w · f(opt) = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).

Admissibility of Clamping: Clamped Adaptive

Let p be a node on open lying on an optimal path to opt, with f(n) = g(n) + h(n) and the clamped priority f̃(n) = min(f̂(n), w · f(n)):

g(sol) = f̃(sol) ≤ f̃(p) ≤ w · f(p) ≤ w · f(opt) = w · g(opt)

And g(sol) ≤ w · g(opt) is still true.

Empirical Evaluation

■ Grid-world pathfinding: four-way and eight-way movement; unit and life cost models; 25%, 30%, 35%, 40%, 45% obstacles
■ Temporal planning: Blocksworld, Logistics, Rover, Satellite, Zenotravel

See the paper for details.

Performance of Clamped Adaptive

[Figure: Four-way Grid Pathfinding (Unit cost), plotting nodes generated (relative to A*) against the sub-optimality bound (1 to 3) for wA* and Clamped Adaptive.]

[Figure: zenotravel (problem 2), same axes, for wA* and Clamped Adaptive.]

[Figure: satellite (problem 2), same axes, for wA* and Clamped Adaptive.]

[Figure: logistics (problem 3), same axes, for Clamped Adaptive and wA*.]

Clamped Adaptive: Summary

■ On-line heuristic correction seems promising.
■ Performance varies: it does well for small bounds, but fails to become greedy.
■ No parameter tuning needed.
■ Clamping gives admissibility with inadmissible heuristics.


Weighted A∗ Respects a Bound

With f(n) = g(n) + h(n) and f′(n) = g(n) + w · h(n), for a node p on open on an optimal path to opt:

g(sol) = f′(sol) ≤ f′(p) = g(p) + w · h(p) ≤ w · (g(p) + h(p)) = w · f(p) ≤ w · f(opt) = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).

Weighted A∗ Respects the Bound and Then Some

g(sol) = f′(sol) ≤ f′(p) = g(p) + w · h(p) ≤ w · (g(p) + h(p)) = w · f(p) ≤ w · f(opt) = w · g(opt)

The key step, g(p) + w · h(p) ≤ w · g(p) + w · h(p), is loose: it gives away a factor of w on g(p), so wA* typically returns solutions well inside the bound.

Solution Quality vs. Bound

■ wA∗ returns solutions better than the bound.
■ Be optimistic: run with a higher weight.

[Figure: Four-way Grid Pathfinding (Unit cost), plotting solution cost (relative to A*) against the sub-optimality bound (1 to 3) for wA*, compared with the line y = x.]

How do we guarantee a suboptimality bound?

Enforcing the Bound

Let p be the deepest node on open on an optimal path to opt. Then f(p) ≤ f(opt). If fmin is the open node with the lowest f, then f(fmin) ≤ f(p), so f(fmin) is a lower bound on the optimal solution cost; maintain it with a priority queue sorted on f.

Optimistic Search:
■ Run a greedy search to find a solution sol.
■ Expand fmin until w · f(fmin) ≥ f(sol), which proves g(sol) ≤ w · g(opt).
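The two phases above might be sketched as follows. This is a simplified reading of Optimistic Search, not the paper's implementation: it keeps one pool of nodes under two orderings (f′ for the greedy phase, plain f for proving the bound), and the choice of aggressive weight as a multiple of the bound is an assumption of the sketch.

```python
import heapq

def optimistic_search(start, goal, neighbors, h, bound, optimism=2.0):
    """Sketch of Optimistic Search: find an incumbent greedily with
    an aggressive weight, then prove the bound by raising f_min.

    Returns (cost, True) where cost <= bound * optimal, or None if
    no solution exists.
    """
    w_agg = optimism * bound  # weight for the greedy phase
    open_fprime = []          # (g + w_agg * h, g, node)
    open_f = []               # (g + h, g, node): f_min lives here
    best_g = {start: 0.0}

    def push(node, g):
        heapq.heappush(open_fprime, (g + w_agg * h(node), g, node))
        heapq.heappush(open_f, (g + h(node), g, node))

    def expand(node, g):
        for child, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                push(child, g2)

    push(start, 0.0)
    incumbent = None
    # Phase 1: greedy search for any solution, ordered on f'.
    while open_fprime and incumbent is None:
        _, g, node = heapq.heappop(open_fprime)
        if g > best_g.get(node, float("inf")):
            continue  # stale entry
        if node == goal:
            incumbent = g
            break
        expand(node, g)
    if incumbent is None:
        return None
    # Phase 2: expand f_min until bound * f_min >= incumbent;
    # f_min is a lower bound on optimal, so this proves
    # incumbent <= bound * optimal.
    while open_f:
        f_min, g, node = open_f[0]
        if bound * f_min >= incumbent:
            return incumbent, True
        heapq.heappop(open_f)
        if g > best_g.get(node, float("inf")):
            continue
        if node == goal:
            incumbent = min(incumbent, g)
            continue
        expand(node, g)
    return incumbent, True  # open exhausted: incumbent is optimal
```

Because the greedy phase already drove f′ low, the cleanup phase often terminates after few extra expansions when the bound is loose.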

Empirical Evaluation

This paper:
■ Grid-world pathfinding: four-way and eight-way movement; unit and life cost models; 25%, 30%, 35%, 40%, 45% obstacles
■ Temporal planning: Blocksworld, Logistics, Rover, Satellite, Zenotravel

To appear in ICAPS:
■ Traveling Salesman: unit square; Pearl and Kim hard
■ Sliding Tile Puzzles: Korf's 100 15-puzzle instances

See the papers for details.

Performance of Optimistic Search

[Figure: Pearl and Kim hard instances, plotting node generations relative to A* against the sub-optimality bound (1.0 to 1.2) for wA* and Optimistic.]

[Figure: Korf's 15-puzzles, plotting node generations relative to IDA* against the sub-optimality bound (1.2 to 2.0) for wA* and Optimistic.]

[Figure: Four-way Grid Pathfinding (Unit cost), plotting nodes generated relative to A* against the sub-optimality bound (1 to 3) for wA* and Optimistic.]

Conclusion

Clamped Adaptive:
■ On-line heuristic correction seems promising.
■ No parameter tuning needed.

Optimistic Search:
■ Performance is predictable.
■ Current results are good and could be improved.

We have two algorithms that can outperform weighted A∗, and we can use arbitrary heuristics for w-admissible search.

The University of New Hampshire

Tell your students to apply to grad school in CS at UNH!

■ friendly faculty
■ funding
■ individual attention
■ beautiful campus
■ low cost of living
■ easy access to Boston, White Mountains
■ strong in AI, infoviz, networking, systems, bioinformatics

Bounded Anytime Weighted A*

[Figure: Korf's 15-puzzles, plotting node generations relative to IDA* against the sub-optimality bound (1.2 to 2.0) for BAwA*, wA*, and Optimistic.]

[Figure: Pearl and Kim hard instances, plotting node generations relative to A* against the sub-optimality bound (1.0 to 1.2) for BAwA*, wA*, and Optimistic.]

Duplicate Dropping can be Important

[Figure: Four-way Grid Pathfinding (Unit cost), plotting nodes generated (relative to A*) against the sub-optimality bound (1 to 3) for wA* and wA* dd.]

Sometimes it isn't

[Figure: Korf's 15 puzzles, plotting node generations relative to IDA* against the sub-optimality bound (1.1 to 1.5) for wA* dd and wA*.]