Fast and Loose in Bounded Suboptimal Heuristic Search

Jordan Thayer and Wheeler Ruml          Ephrat Bitton
{jtd7, ruml} at cs.unh.edu              ebitton at berkeley.edu

Jordan Thayer (UNH)
Fast Bounded Suboptimal Search – 1 / 35
Motivation

■ Finding optimal solutions is prohibitively expensive.
■ It's nice to limit suboptimality.
■ Weighted A* is a popular method for doing that.
■ This talk: two algorithms which are often better.

[Figure: Four-way Grid Pathfinding (Unit cost). Two panels plot nodes generated (relative to A*) and solution cost (relative to A*) against problem size, for A*, Greedy, wA*, Clamped Adaptive, and Optimistic.]
Talk Outline

■ Background: Weighted A*
■ Strict Approach: Clamped Adaptive
    Correct for underestimating h(n)
    Bound the correction to ensure w-admissibility
■ Loose Approach: Optimistic Search
    Greedily search for a solution
    Enforce the suboptimality bound afterwards
Weighted A* (Pohl, 1970)

A* is a best-first search ordered on f(n) = g(n) + h(n).
Weighted A*: f'(n) = g(n) + w · h(n)

What does w do?
■ breaks ties on f(n) in favor of high g(n)
■ corrects for underestimating h(n)
■ deepens search / emphasises greed
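The search above can be sketched in a few lines. This is a minimal illustration, not code from the talk: `neighbors(state)` (yielding successor/cost pairs) and the heuristic `h` are assumed to be supplied by the caller.

```python
# Minimal weighted A* sketch: best-first search on f'(n) = g(n) + w * h(n).
# `neighbors` and `h` are assumed, illustrative interfaces, not from the talk.
import heapq

def weighted_astar(start, goal, neighbors, h, w=1.5):
    """Return (cost, path); with admissible h, cost <= w * optimal."""
    open_list = [(w * h(start), 0, start, [start])]  # (f', g, state, path)
    best_g = {start: 0}
    while open_list:
        _, g, state, path = heapq.heappop(open_list)
        if state == goal:
            return g, path
        if g > best_g.get(state, float("inf")):
            continue  # stale queue entry for a state reached more cheaply
        for succ, cost in neighbors(state):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(open_list,
                               (g2 + w * h(succ), g2, succ, path + [succ]))
    return None
```

Setting w = 1 recovers plain A*; large w makes the ordering nearly greedy on h.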
Weighted A* Respects a Bound

Let p be a node in open on an optimal path to opt, with
f(n) = g(n) + h(n) and f'(n) = g(n) + w · h(n).

g(sol) = f'(sol) ≤ f'(p)
       = g(p) + w · h(p)
       ≤ w · (g(p) + h(p))
       = w · f(p)
       ≤ w · f(opt)
       = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).
Weighted A* is a Popular Choice

■ Weighted A*, Pohl (1970)
■ Dynamically Weighted A*, Pohl (1973)
■ Aε, Ghallab & Allard (1983)
■ A*ε, Pearl (1984)
■ AlphA*, Reese & Frichs (unpublished)

[Figure: Eight-way Grid Pathfinding (Unit cost). Nodes generated (relative to A*) vs. sub-optimality bound for dwA*, A*ε, AlphA*, and wA*.]
Talk Outline

■ Background: Weighted A*
■ Strict Approach: Clamped Adaptive
    Correct for underestimating h(n)
    Bound the correction to ensure w-admissibility
■ Loose Approach: Optimistic Search
    Greedily search for a solution
    Enforce the suboptimality bound afterwards
Improving Weighted A*

■ If h were perfect, solutions would be found in linear time.
■ How do we improve h(n)? By correcting for the error in h(n).
■ We'll ensure w-admissibility shortly.
Correcting h(n) with One-Step Error

Consider a single expansion: parent p generates best child bc.
Recall that f(n) = g(n) + h(n).

■ f(n) should remain constant from parent to child:
    if f(n) = g(n) + h*(n), this would be true.
■ g(n) is exact, so all the error in f(n) comes from h(n):
    err_h = f(bc) − f(p)
■ Track a running average of err_h:
    ĥ(n) = h(n) · (1 + err_h)
    f̂(n) = g(n) + ĥ(n)
■ ĥ(n) is inadmissible. Clamping enforces w-admissibility.
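The bookkeeping above is simple enough to sketch. This is an illustrative reading of the slide, with invented names; the paper's actual implementation may differ in detail.

```python
# Sketch of the single-step error model: at each expansion, compare f of the
# best child to f of the parent, keep a running mean of the error, and
# inflate h accordingly. Class and method names are illustrative.

class HeuristicCorrector:
    def __init__(self):
        self.total_err = 0.0
        self.expansions = 0

    def observe(self, f_parent, f_best_child):
        # err_h = f(bc) - f(p); positive when h underestimates
        self.total_err += f_best_child - f_parent
        self.expansions += 1

    @property
    def mean_err(self):
        return self.total_err / self.expansions if self.expansions else 0.0

    def h_hat(self, h_value):
        # hhat(n) = h(n) * (1 + running mean of the one-step error)
        return h_value * (1.0 + self.mean_err)
```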
Admissibility of Clamping: Weighted A*

Let p be a node in open on an optimal path to opt, with
f(n) = g(n) + h(n) and f'(n) = g(n) + w · h(n).

g(sol) = f'(sol) ≤ f'(p)
       = g(p) + w · h(p)
       ≤ w · (g(p) + h(p))
       = w · f(p)
       ≤ w · f(opt)
       = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).
Admissibility of Clamping: Clamped Adaptive

Let p be a node in open on an optimal path to opt, with
f(n) = g(n) + h(n) and f̃(n) = min(f̂(n), w · f(n)).

g(sol) = f̃(sol) ≤ f̃(p)
f̃(p) ≤ w · f(p)
w · f(p) ≤ w · f(opt) = w · g(opt)

So g(sol) ≤ w · g(opt) still holds.
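The clamp itself is one line; here is a minimal sketch of the priority it assigns. However inaccurate f̂ is, a node is never ranked worse than w · f(n), which is exactly what the argument above needs.

```python
# The clamped priority from the slide: ftilde(n) = min(fhat(n), w * f(n)).
# The argument names are illustrative.

def clamped_priority(g, h, h_hat, w):
    """Priority for Clamped Adaptive: the corrected estimate, clamped to w*f."""
    f = g + h          # admissible f(n)
    f_hat = g + h_hat  # corrected, possibly inadmissible
    return min(f_hat, w * f)
```

When the corrected estimate stays inside the clamp it is used as-is; a wildly inflated estimate falls back to w · f(n), preserving w-admissibility.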
Empirical Evaluation

■ Grid-world pathfinding
    Four-way and eight-way movement
    Unit and life cost models
    25%, 30%, 35%, 40%, 45% obstacles
■ Temporal planning
    Blocksworld, Logistics, Rover, Satellite, Zenotravel

See the paper for details.
Performance of Clamped Adaptive

[Figure: Four-way Grid Pathfinding (Unit cost). Nodes generated (relative to A*) vs. sub-optimality bound for wA* and Clamped Adaptive.]

[Figure: zenotravel (problem 2). Nodes generated (relative to A*) vs. sub-optimality bound for wA* and Clamped Adaptive.]

[Figure: satellite (problem 2). Nodes generated (relative to A*) vs. sub-optimality bound for wA* and Clamped Adaptive.]

[Figure: logistics (problem 3). Nodes generated (relative to A*) vs. sub-optimality bound for Clamped Adaptive and wA*.]
Clamped Adaptive: Summary

■ On-line heuristic correction seems promising.
■ Performance varies: it does well for small bounds, but fails to become greedy.
■ No parameter tuning needed.
■ Clamping yields w-admissible search from inadmissible heuristics.
Talk Outline

■ Background: Weighted A*
■ Strict Approach: Clamped Adaptive
    Correct for underestimating h(n)
    Bound the correction to ensure w-admissibility
■ Loose Approach: Optimistic Search
    Greedily search for a solution
    Enforce the suboptimality bound afterwards
Weighted A* Respects a Bound

f(n) = g(n) + h(n)    f'(n) = g(n) + w · h(n)

g(sol) = f'(sol) ≤ f'(p)
       = g(p) + w · h(p)
       ≤ w · (g(p) + h(p))
       = w · f(p)
       ≤ w · f(opt)
       = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).
Weighted A* Respects the Bound and Then Some

f(n) = g(n) + h(n)    f'(n) = g(n) + w · h(n)

The step g(p) + w · h(p) ≤ w · (g(p) + h(p)) is loose:
g(p) + w · h(p) ≤ w · g(p) + w · h(p), since g(p) ≤ w · g(p).
The proof gives away slack, so wA* typically beats its bound.
Solution Quality vs. Bound

■ wA* returns solutions better than the bound.
■ Be optimistic: run with a higher weight.

[Figure: Four-way Grid Pathfinding (Unit cost). Solution cost (relative to A*) vs. sub-optimality bound; wA* lies well below the line y = x.]

How do we guarantee a suboptimality bound?
Enforcing the Bound

■ Let p be the deepest node in open on an optimal path to opt, and let
  f_min be the minimum f value in open:
    f_min ≤ f(p) ≤ f(opt)
■ f_min provides a lower bound on optimal solution cost.
    Determine f_min with a priority queue sorted on f.
■ Optimistic Search:
    Run a greedy (high-weight) search to find a solution.
    Then expand the lowest-f node until w · f_min ≥ f(sol).
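The cleanup phase described above can be sketched as follows. This is an assumed interface, not code from the paper: `open_f` is a min-heap of frontier f-values, `f_sol` is the cost of the greedy incumbent, and `expand_min` is a hypothetical callback that pops the lowest-f node, expands it, and pushes its children's f-values back.

```python
# Sketch of Optimistic Search's bound-proving phase: expand lowest-f nodes,
# raising f_min, until w * f_min >= f(sol). Interface names are illustrative.
import heapq

def cleanup(open_f, expand_min, f_sol, w):
    """Return True once the incumbent is proven within the bound w."""
    while open_f and w * open_f[0] < f_sol:
        expand_min(open_f)  # pops min-f node, pushes its children's f-values
    # Empty open means the incumbent is optimal; otherwise the bound holds.
    return (not open_f) or w * open_f[0] >= f_sol
```

Because the incumbent already exists, this phase only has to raise the lower bound f_min, which is usually much cheaper than searching for a solution under the strict priority.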
Empirical Evaluation

This paper:
■ Grid-world pathfinding
    Four-way and eight-way movement
    Unit and life cost models
    25%, 30%, 35%, 40%, 45% obstacles
■ Temporal planning
    Blocksworld, Logistics, Rover, Satellite, Zenotravel

To appear in ICAPS:
■ Traveling Salesman
    Unit square; Pearl and Kim hard
■ Sliding Tile Puzzles
    Korf's 100 15-puzzle instances

See the papers for details.
Performance of Optimistic Search

[Figure: Pearl and Kim Hard. Node generations relative to A* vs. sub-optimality bound for wA* and Optimistic.]

[Figure: Korf's 15 Puzzles. Node generations relative to IDA* vs. sub-optimality bound for wA* and Optimistic.]

[Figure: Four-way Grid Pathfinding (Unit cost). Nodes generated (relative to A*) vs. sub-optimality bound for wA* and Optimistic.]
Conclusion

Clamped Adaptive:
■ On-line heuristic correction seems promising.
■ No parameter tuning needed.

Optimistic Search:
■ Performance is predictable.
■ Current results are good and could be improved.

We have two algorithms that can outperform weighted A*,
and we can use arbitrary heuristics for w-admissible search.
The University of New Hampshire

Tell your students to apply to grad school in CS at UNH!

■ friendly faculty
■ funding
■ individual attention
■ beautiful campus
■ low cost of living
■ easy access to Boston, White Mountains
■ strong in AI, infoviz, networking, systems, bioinformatics
Bounded Anytime Weighted A*

[Figure: Korf's 15 Puzzles. Node generations relative to IDA* vs. sub-optimality bound for BAwA*, wA*, and Optimistic.]

[Figure: Pearl and Kim Hard. Node generations relative to A* vs. sub-optimality bound for BAwA*, wA*, and Optimistic.]
Duplicate Dropping Can Be Important

[Figure: Four-way Grid Pathfinding (Unit cost). Nodes generated (relative to A*) vs. sub-optimality bound for wA* and wA* dd.]

Sometimes It Isn't

[Figure: Korf's 15 Puzzles. Node generations relative to IDA* vs. sub-optimality bound for wA* dd and wA*.]