Johan Hallberg: Single- and Multi-Fidelity Bandits in Monte Carlo Tree Search - from the Casino to Mobile Network Optimization
Bachelor thesis presentation
Time: Tue 2022-08-23 10.30 - 11.30
Location: Albano, House 1, Room T
Respondent: Johan Hallberg
Mobile networks are made of several base stations, each one with a number of antennas, that require the optimization of several parameters in order to provide good service. Automatic optimization of these parameters is difficult as the size of the search space is exponential in the number of antennas, which is typically large. The recent success of Monte Carlo Tree Search methods in large and complex games, such as Go, suggests that similar methods might be used in large discrete black-box optimization, as they can be modeled as sequential decision-making problems. In this thesis, we study and apply a version of the famous upper confidence bound for trees (UCT) algorithm to black-box optimization. We state and prove a form of the UCT algorithm that allows for more general distributions. Moreover, we suggest a novel Multi-fidelity UCT algorithm based on the multi-fidelity bandit problem. Experimental results indicate that this new algorithm finds at least as good solutions in a shorter time, given an existing low-fidelity approximation. We discuss and empirically test different low-fidelity approximations.