Ludwig Hedlin: Convergence of linear neural networks to global minimizers

Time: Tue 2020-10-13, 11:15

Speaker: Ludwig Hedlin, KTH

Abstract

It is known that gradient flow in linear neural networks with Euclidean loss almost always avoids critical points that have at least one eigendirection with negative curvature. Using algebraic invariants of the gradient flow, we try to prove that, for arbitrary networks, the set of all critical points with no second-order curvature (zero Hessian) is associated with a lower-dimensional subset of the invariants. This would mean that these critical points are almost surely avoided. We show that this holds for networks with three or fewer hidden layers and in a few other special cases. We show by explicit counterexample that it is not true for general deep networks. This talk is the presentation of a master's thesis.
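As a rough illustration of what an invariant of the gradient flow can look like, the sketch below uses a two-layer linear network x -> W2 W1 x with squared Euclidean loss and checks numerically that the matrix W2^T W2 - W1 W1^T barely changes during training. This is an assumption made for illustration: the abstract does not say which invariants the thesis uses, and the function name, network sizes, and step size are all hypothetical. Gradient descent with a small step stands in for the continuous-time flow, so the quantity is only approximately conserved here.

```python
import numpy as np


def invariant_drift(step=1e-4, n_steps=2000, seed=0):
    """Run small-step gradient descent on a two-layer linear network and
    return (drift of the candidate invariant, distance moved by W1).

    Hypothetical setup: loss = 0.5 * ||W2 @ W1 @ X - Y||_F^2.
    """
    rng = np.random.default_rng(seed)
    d, h, o, n = 4, 3, 2, 20          # input, hidden, output sizes; sample count
    X = rng.standard_normal((d, n))
    Y = rng.standard_normal((o, n))
    W1 = 0.5 * rng.standard_normal((h, d))
    W2 = 0.5 * rng.standard_normal((o, h))

    # Candidate invariant (assumed for illustration): C = W2^T W2 - W1 W1^T.
    C0 = W2.T @ W2 - W1 @ W1.T
    W1_start = W1.copy()

    for _ in range(n_steps):
        E = W2 @ W1 @ X - Y           # residual
        G1 = W2.T @ E @ X.T           # gradient of the loss w.r.t. W1
        G2 = E @ X.T @ W1.T           # gradient of the loss w.r.t. W2
        W1 = W1 - step * G1
        W2 = W2 - step * G2

    C = W2.T @ W2 - W1 @ W1.T
    drift = np.linalg.norm(C - C0)
    moved = np.linalg.norm(W1 - W1_start)
    return drift, moved


if __name__ == "__main__":
    drift, moved = invariant_drift()
    print("invariant drift:", drift, "weight movement:", moved)
```

Under the exact flow the two gradient contributions to the time derivative of C cancel, so any drift seen in the discrete run comes from the finite step size and shrinks as the step is reduced.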

Notes: The seminar will take place in F11 for the first 18 people to arrive. Overflow audience and those working from home can participate via Zoom with meeting ID 62586628413.
