Paper
Inference-only MCTS: expansion samples N children per node, and evaluation combines a self-generated LM score with a self-consistency score (the largest share of samples that agree)
On failure, it also generates a Reflection to inform subsequent expansions
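
The evaluation step above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the exact-string-match notion of "agree", and the `weight` mixing parameter are all assumptions made for the example, and the LM score is taken as a number already produced elsewhere.

```python
from collections import Counter
from typing import Sequence


def self_consistency_scores(answers: Sequence[str]) -> list[float]:
    """Score each sampled child by the fraction of samples sharing its answer."""
    if not answers:
        return []
    counts = Counter(answers)
    total = len(answers)
    return [counts[a] / total for a in answers]


def max_agreement(answers: Sequence[str]) -> float:
    """Largest share of samples that agree on one answer ("max % agree")."""
    if not answers:
        return 0.0
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)


def node_value(lm_score: float, sc_score: float, weight: float = 0.5) -> float:
    """Blend the LM score and the self-consistency score.

    `weight` is a hypothetical mixing parameter (assumption); the notes only
    say the two scores are added together.
    """
    return weight * lm_score + (1.0 - weight) * sc_score
```

For example, with four sampled children answering `["a", "a", "b", "a"]`, `max_agreement` returns `0.75`, and that value could be passed to `node_value` alongside the LM's own score for the node.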