Paper


The Core Innovation: Dynamic Width Decision
The key innovation is adaptive branching - dynamically deciding whether to "go wider" (explore new solutions) or "go deeper" (refine existing solutions) at each decision point during the search.
The Problem with Traditional Approaches
Traditional methods have fixed strategies:
- Repeated Sampling: Only explores new solutions (pure width)
- Sequential Refinement: Only refines one solution (pure depth)
- Standard MCTS: Fixed branching factor (e.g., always generate 5 new children)
These fixed strategies miss opportunities because different problems benefit from different exploration-exploitation balances.
How AB-MCTS Works
- The GEN Node Concept
- Every node in the search tree has a special "GEN" (generate) child node
- Selecting the GEN node means: generate a brand new solution
- Selecting an existing child means: refine that solution further
- Statistical Decision Making
- Uses Bayesian statistics to model the expected score of:
- A new solution (if we choose GEN)
- A refined solution (if we choose an existing child)
- Employs Thompson sampling to make the selection
Two Variants
AB-MCTS-M (Mixed Model)
- Uses a hierarchical Bayesian model
- Groups solutions by their parent node