Zacharias Veiksaar: A Study of Portfolio Optimization in Discrete Time: From Markowitz to Reinforcement Learning
Bachelor's thesis in Mathematics
Time: Thu 2025-08-28, 08:30 - 09:30
Location: Cramér meeting room, Albano building 1
Respondent: Zacharias Veiksaar
Supervisor: Yishao Zhou
Abstract
This paper investigates portfolio optimization in discrete time, tracing its development from the classical mean-variance framework through multi-period extensions to modern reinforcement learning approaches. We begin with a rigorous treatment of the single-period case, deriving analytical solutions, highlighting their sensitivity to estimation errors, and proposing regularization as a remedy. We then extend the framework to a multi-period setting using dynamic programming, where we encounter time-inconsistency in the mean-variance formulation and propose a time-consistent reformulation that we solve analytically. Because this approach still relies on estimating asset return distributions, we propose reinforcement learning as a model-free alternative that circumvents the need to estimate these parameters explicitly. We reformulate the time-consistent multi-period problem as a Markov decision process, prove that optimal solutions exist, and argue that they can be found within the reinforcement learning framework. The results highlight the mathematical structure of portfolio optimization, provide a broad treatment of the limitations of classical approaches, and lay the groundwork for more robust machine learning methods as an alternative.
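As a rough illustration of the single-period case mentioned in the abstract, the sketch below computes the textbook closed-form minimum-variance weights for a target expected return under a full-investment constraint. It is not taken from the thesis: the function name `mean_variance_weights`, the ridge-style regularization (adding a multiple of the identity to the covariance matrix), and the sample numbers are all assumptions made for illustration, and the thesis may use a different formulation or regularizer.

```python
import numpy as np

def mean_variance_weights(mu, sigma, target, ridge=0.0):
    """Minimize w' S w subject to w' mu = target and w' 1 = 1.

    S = sigma + ridge * I is an assumed ridge-style regularization;
    ridge=0.0 gives the unregularized classical solution.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = mu.shape[0]
    ones = np.ones(n)
    s = sigma + ridge * np.eye(n)

    # Solve linear systems instead of forming the inverse explicitly.
    s_inv_ones = np.linalg.solve(s, ones)
    s_inv_mu = np.linalg.solve(s, mu)

    # Scalars from the Lagrangian first-order conditions.
    a = ones @ s_inv_ones
    b = ones @ s_inv_mu
    c = mu @ s_inv_mu
    d = a * c - b ** 2  # nonzero unless all expected returns are equal

    return ((c - b * target) * s_inv_ones + (a * target - b) * s_inv_mu) / d

# Hypothetical inputs, chosen only to exercise the function.
mu = np.array([0.05, 0.07, 0.10])
sigma = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])
w = mean_variance_weights(mu, sigma, target=0.08, ridge=1e-3)
w_plain = mean_variance_weights(mu, sigma, target=0.08)
```

Comparing `w` and `w_plain` for nearby values of `mu` gives a quick feel for the sensitivity to estimation error that the abstract refers to. Both weight vectors satisfy the two constraints exactly, since the regularization only changes the matrix used in the objective.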
