Sofia Näslund: Application of Gradient Boosting Machines and Extreme Gradient Boosting Machines in Claims Modeling
Master's thesis in Insurance Mathematics
Time: Mon 2025-06-09, 10.40 - 11.20
Location: Cramér room, Department of Mathematics, floor 3, house 1, Albano
Respondent: Sofia Näslund
Supervisor: Mathias Millberg Lindholm
Abstract.
One of the most critical aspects of insurance operations is developing pricing models that accurately reflect the true underlying risk structure. In non-life insurance, ensemble methods such as Gradient Boosting Machines (GBM) and Extreme Gradient Boosting (XGBoost) have gained increasing attention over the years. Both GBM and XGBoost are built by sequentially adding weak learners with the goal of minimizing a specified loss function, which allows the models to capture complex and non-linear relationships within the data.
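As a rough illustration of the sequential fitting described above, the following sketch boosts decision stumps on a single feature. It uses squared-error loss for simplicity, which is an assumption made here and not necessarily the loss used in the thesis. The function names (`fit_stump`, `boost`, `predict`), the default parameters, and the toy data are all hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct feature value: fall back to a constant stump.
        mean = sum(residuals) / len(residuals)
        return (x[0], mean, mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def boost(x, y, n_rounds=100, learning_rate=0.1):
    """Sequentially add stumps, each fitted to the current residuals.

    For squared-error loss the residuals y - f(x) are the negative
    gradient of the loss, so each round takes a small gradient step.
    """
    init = sum(y) / len(y)
    preds = [init] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for pi, xi in zip(preds, x)]
    return init, learning_rate, stumps


def predict(model, xi):
    """Sum the initial constant and all shrunken stump contributions."""
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, xi) for s in stumps)


# Toy data (hypothetical): a step function in one feature.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
model = boost(x, y)
```

Each round corrects only a fraction of the remaining error, controlled by `learning_rate`, which is why many weak learners are combined instead of fitting one large tree.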
The aim of this study is to further investigate the performance of these models in the context of claim count modeling. The models are evaluated based on their out-of-sample Poisson deviance and their ability to differentiate risk levels, which is illustrated through quantile plots. Due to the black-box nature of these methods, interpretability techniques such as variable importance and partial dependence plots are used.
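For readers unfamiliar with the evaluation metric, the sketch below computes the mean Poisson deviance in one common form. It assumes the predictions are expected claim counts (for example, predicted frequency times exposure); the exact definition and scaling used in the thesis may differ. The function name `poisson_deviance` and the sample values are hypothetical.

```python
import math


def poisson_deviance(y_true, mu_pred):
    """Mean Poisson deviance: (2/n) * sum(y*log(y/mu) - (y - mu)).

    The term y*log(y/mu) is taken to be 0 when y == 0.
    Predictions mu must be strictly positive.
    """
    if len(y_true) != len(mu_pred):
        raise ValueError("y_true and mu_pred must have the same length")
    total = 0.0
    for y, mu in zip(y_true, mu_pred):
        if mu <= 0:
            raise ValueError("predicted means must be strictly positive")
        term = mu - y
        if y > 0:
            term += y * math.log(y / mu)
        total += 2.0 * term
    return total / len(y_true)


# Hypothetical held-out claim counts and two sets of predicted means.
observed = [0, 0, 1, 0, 2]
model_a = [0.1, 0.2, 0.8, 0.1, 1.5]
model_b = [0.6, 0.6, 0.6, 0.6, 0.6]
deviance_a = poisson_deviance(observed, model_a)
deviance_b = poisson_deviance(observed, model_b)
```

A lower out-of-sample deviance indicates predictions closer to the observed counts, and a prediction that matches every observed count exactly gives a deviance of zero.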
The analysis is divided into two parts: the first is a simulation study on datasets of varying sizes, while the second applies the models to two real insurance datasets. Since the distribution of the simulated data is known, the models in the first part are also evaluated by their ability to recover the true expected frequency for each observation.
The results from both analyses indicate that GBM generally provides stronger risk differentiation and slightly better generalization performance, even in cases where XGBoost had a notably more complex model structure. GBM also outperformed XGBoost on a highly imbalanced dataset, a finding that contrasts with previous research and suggests that GBM may be more robust under such conditions.