Zacharias Veiksaar: A Study of Portfolio Optimization in Discrete Time: From Markowitz to Reinforcement Learning

Bachelor's thesis in Mathematics

Time: Thu 2025-08-28, 08.30 - 09.30

Location: Cramér meeting room, Albano building 1

Respondent: Zacharias Veiksaar

Supervisor: Yishao Zhou

Abstract

This paper investigates portfolio optimization in discrete time, tracing its development from the classical mean-variance framework through multi-period extensions to modern reinforcement learning approaches. We begin with a rigorous treatment of the single-period case, deriving analytical solutions, highlighting their sensitivity to estimation errors, and proposing regularization as a remedy. We then extend the framework to a multi-period setting using dynamic programming, where we encounter time-inconsistency in the mean-variance formulation and propose a time-consistent reformulation that we solve analytically. Because this approach still relies on estimating asset return distributions, we propose reinforcement learning as a model-free alternative that circumvents the need to estimate these parameters explicitly. We reformulate the time-consistent multi-period problem as a Markov decision process, prove that optimal solutions exist, and argue that they can be found within the reinforcement learning framework. The results highlight the mathematical structure of portfolio optimization, provide a broad treatment of the limitations of classical approaches, and lay the groundwork for more robust machine learning methods as an alternative.
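As an illustration of the single-period case mentioned in the abstract, the following is a minimal sketch of a textbook mean-variance solution with a ridge-style regularizer. It is not taken from the thesis: the objective (maximize w'mu - (gamma/2) w'Sigma w subject to the weights summing to one), the function name `mean_variance_weights`, the risk-aversion parameter `gamma`, and the ridge parameter `delta` are all assumptions chosen for illustration, and the thesis may use a different formulation and a different regularizer.

```python
import numpy as np


def mean_variance_weights(mu, sigma, gamma=1.0, delta=0.0):
    """Closed-form fully invested mean-variance weights (illustrative sketch).

    Assumed problem: maximize  w'mu - (gamma/2) w'(sigma + delta*I) w
                     subject to sum(w) = 1.
    delta > 0 adds a ridge term to the covariance matrix, one common way
    to dampen sensitivity to estimation error (an assumption here, not
    necessarily the regularization used in the thesis).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = mu.shape[0]

    # Regularized covariance matrix.
    sigma_reg = sigma + delta * np.eye(n)
    ones = np.ones(n)

    # Solve linear systems instead of forming an explicit inverse.
    inv_mu = np.linalg.solve(sigma_reg, mu)
    inv_ones = np.linalg.solve(sigma_reg, ones)

    # Lagrange multiplier enforcing the budget constraint sum(w) = 1.
    lam = (ones @ inv_mu - gamma) / (ones @ inv_ones)

    return (inv_mu - lam * inv_ones) / gamma


# Hypothetical inputs: two assets with made-up expected returns and covariances.
mu = [0.10, 0.02]
sigma = [[0.04, 0.01], [0.01, 0.09]]
w_plain = mean_variance_weights(mu, sigma, gamma=3.0)
w_ridge = mean_variance_weights(mu, sigma, gamma=3.0, delta=0.05)
```

Under these assumptions, increasing `delta` pulls the weights toward the equally weighted portfolio, which makes them less sensitive to errors in the estimated expected returns and covariances.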