MCTS-Minimax Hybrids

Hendrik Baier ∙ Mark H.M. Winands

Overview

• Motivation
• MCTS-Minimax Hybrid Algorithms
• Test Domains
• Experimental Results
• Conclusions and Future Work

Motivation: Monte-Carlo Tree Search

Motivation: Minimax

MCTS-Minimax Hybrids

MCTS-Minimax Hybrids: MCTS-MR (Minimax Rollouts)
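
A minimal sketch of the minimax-rollout idea, assuming a hypothetical toy game (take 1 to 3 stones, whoever takes the last stone wins) that is not one of the test domains in this talk. Each rollout move is chosen by a shallow, knowledge-free minimax that can only detect proven wins and losses; when nothing is proven, the move is random. Avoiding proven losses in the random fallback is an assumption of this sketch, and all function names are illustrative.

```python
import random

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins

def legal_moves(stones):
    return [m for m in MOVES if m <= stones]

def minimax(stones, depth):
    """+1: proven win for the player to move, -1: proven loss, 0: unknown."""
    if stones == 0:
        return -1  # the opponent just took the last stone
    if depth == 0:
        return 0   # no evaluation function, so the outcome stays unknown
    best = -1
    for m in legal_moves(stones):
        best = max(best, -minimax(stones - m, depth - 1))
        if best == 1:
            break  # a proven win cannot be improved
    return best

def rollout_move(stones, depth, rng):
    """Pick a rollout move with a depth-limited minimax (depth >= 1)."""
    moves = legal_moves(stones)
    values = {m: -minimax(stones - m, depth - 1) for m in moves}
    winning = [m for m in moves if values[m] == 1]
    if winning:
        return winning[0]
    # Assumption of this sketch: avoid proven losses when possible.
    safe = [m for m in moves if values[m] != -1]
    return rng.choice(safe or moves)

def rollout(stones, depth, rng):
    """Play to the end (stones > 0); return the winner's index (0 moves first)."""
    player = 0
    while True:
        stones -= rollout_move(stones, depth, rng)
        if stones == 0:
            return player
        player = 1 - player
```

With `depth=2`, a rollout from 5 stones is always won by the first player, because the only move not proven losing leaves the opponent with 4 stones.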

MCTS-Minimax Hybrids: MCTS-MS (Minimax Selection)
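
A rough sketch of triggering minimax from inside the tree during selection, assuming a visit-count threshold as the trigger criterion and the same hypothetical take-1-to-3-stones toy game. The `Node` fields, `visit`, `threshold`, and `depth` are illustrative names, not taken from the slides. If the shallow search proves a win or loss, the node is marked as proven so that later simulations can stop there.

```python
from dataclasses import dataclass
from typing import Optional

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins

def minimax(stones, depth):
    """+1: proven win for the player to move, -1: proven loss, 0: unknown."""
    if stones == 0:
        return -1
    if depth == 0:
        return 0
    best = -1
    for m in MOVES:
        if m <= stones:
            best = max(best, -minimax(stones - m, depth - 1))
    return best

@dataclass
class Node:
    stones: int                   # toy state; the player to move is implicit
    visits: int = 0
    proven: Optional[int] = None  # +1 / -1 for the player to move, None if unknown
    searched: bool = False        # minimax is tried at most once per node

def visit(node, threshold=5, depth=3):
    """Call whenever selection passes through `node`; returns its proven value."""
    node.visits += 1
    if node.proven is None and not node.searched and node.visits >= threshold:
        node.searched = True
        value = minimax(node.stones, depth)
        if value != 0:
            node.proven = value   # proven nodes need no further rollouts
    return node.proven
```

In this sketch a node holding 4 stones is proven lost on its fifth visit, while a node holding 9 stones stays unproven because a depth-3 search cannot see the end of the game.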

MCTS-Minimax Hybrids: MCTS-MB (Minimax Backpropagation)
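
One possible reading of minimax during backpropagation, sketched on the same hypothetical take-1-to-3-stones toy game: after a simulation reaches a node whose game-theoretic value is known, shallow minimax searches are run while walking back up the path, trying to prove ancestors earlier than ordinary backpropagation would. The function name, the `path` representation, and the stop-at-first-failure rule are assumptions of this sketch.

```python
MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones, last stone wins

def minimax(stones, depth):
    """+1: proven win for the player to move, -1: proven loss, 0: unknown."""
    if stones == 0:
        return -1
    if depth == 0:
        return 0
    best = -1
    for m in MOVES:
        if m <= stones:
            best = max(best, -minimax(stones - m, depth - 1))
    return best

def backpropagate_with_minimax(path, depth=3):
    """`path` lists the stone counts from the root down to the leaf.

    Returns {stones: proven value} for the nodes that could be proven,
    walking from the leaf upward and stopping at the first unproven node.
    """
    proven = {}
    for stones in reversed(path):
        value = minimax(stones, depth)
        if value == 0:
            break  # fall back to ordinary backpropagation above this node
        proven[stones] = value
    return proven
```

For the path `[9, 6, 4, 1]` with `depth=3`, the nodes with 1, 4, and 6 stones are proven (win, loss, win for the player to move), and the root with 9 stones is left to regular MCTS statistics.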

Test Domains: Connect-4

Test Domains: Breakthrough

Experimental Results

Hybrid      Connect 4    Breakthrough
MCTS-MR     73.2%        27.0%
MCTS-MS     53.4%        62.2%
MCTS-MB     52.1%        55.0%

Experimental Results: Influence of Time Controls

Conclusions

• Three knowledge-free ways of integrating minimax into MCTS
• Newly proposed MCTS-MB and MCTS-MS significantly outperform regular MCTS-Solver
• MCTS-MR seems to be more sensitive to differences between search spaces (at least when used without knowledge)

Future Work

• Examine influence of algorithm properties such as speed and quality of rollouts (here uniformly random)
• Examine influence of game properties such as branching factor, game length, terminal state density, trap density, etc.
• Incorporate knowledge in the form of evaluation functions, and find ways of combining evaluation results with MCTS rollout returns

Additional Games

Terminal State Density

Questions?
