 So far, our policy has been random. In that sense, we have gained very little from using Monte Carlo Tree Search over simply planning all possible futures. So which tree would we produce if the policy were simple but not flat? Now it is your turn to try this out.
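As a starting point for the experiment, here is a minimal sketch that compares how many distinct tree nodes get visited under a flat policy and under a non-flat one. It assumes that "flat" means uniform over actions, and it deliberately leaves out everything else in Monte Carlo Tree Search (no value estimates, no backup, no exploration bonus). The function name `grow_tree`, the three-action setup, and the weights in `biased` are all hypothetical choices made for illustration.

```python
import random

def grow_tree(policy, depth=4, n_rollouts=200, seed=0):
    """Sample rollouts from the root and collect the nodes they visit.

    `policy` is a list of action probabilities (assumed state-independent).
    A node is identified by the tuple of actions that leads to it.
    """
    rng = random.Random(seed)
    actions = range(len(policy))
    visited = set()
    for _ in range(n_rollouts):
        path = ()
        for _ in range(depth):
            # Draw the next action according to the policy weights.
            action = rng.choices(actions, weights=policy)[0]
            path += (action,)
            visited.add(path)
    return visited

# Hypothetical policies over three actions.
flat = [1 / 3, 1 / 3, 1 / 3]   # every action equally likely
biased = [0.8, 0.15, 0.05]     # simple, but not flat

flat_tree = grow_tree(flat)
biased_tree = grow_tree(biased)
print("nodes visited with flat policy:  ", len(flat_tree))
print("nodes visited with biased policy:", len(biased_tree))
```

With the same number of rollouts, the biased policy should concentrate its visits on far fewer branches, so the resulting tree is narrower and its frequently visited paths are sampled much more often. Swapping in your own policy for `biased` is one way to explore the question above.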