Nonlinear Optimization Techniques Applied to Neural Network Training
- Friday, 20 March 2026, 10:00
- INF 205, Room 02/414
- Leonie Kreis
Address: Mathematikon, Im Neuenheimer Feld 205, Room 02/414
Event Type: Doctoral Examination
This thesis investigates how techniques from classical deterministic nonlinear optimization can be adapted to the stochastic, large-scale, and non-convex problems arising in neural network training. Three main contributions are presented.
First, stochastic multilevel optimization methods inspired by MGOpt are studied for both convex and non-convex objectives. Variants based on SGD are analyzed, including new convergence results for stochastic bi-level formulations, highlighting both the potential and the limitations of multilevel approaches in neural network training.
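As a rough illustration of the multilevel idea, the sketch below runs a stochastic two-level MG/Opt-style V-cycle on a toy quadratic: SGD steps act as smoothers on the fine level, and a first-order ("tau") correction keeps the coarse gradient coherent with the restricted fine gradient. The restriction and prolongation operators, step sizes, and all names here are hypothetical simplifications for a quadratic model problem, not the algorithm studied in the thesis.

```python
# Minimal stochastic two-level MG/Opt-style V-cycle on a toy quadratic.
# Everything here (restrict, prolong, sgd_smooth, tau) is an illustrative
# assumption, not the thesis's method.
import numpy as np

rng = np.random.default_rng(0)
n = 8                                   # fine-level dimension (even, for pairing)
A = np.diag(np.linspace(1.0, 10.0, n))  # SPD Hessian of f(x) = 0.5 x'Ax - b'x
b = rng.standard_normal(n)

def grad(x, noise=0.1):
    """Stochastic gradient of the toy objective."""
    return A @ x - b + noise * rng.standard_normal(x.shape)

def restrict(v):   # pairwise averaging: fine -> coarse
    return 0.5 * (v[0::2] + v[1::2])

def prolong(v_H):  # duplication: coarse -> fine
    return np.repeat(v_H, 2)

def sgd_smooth(x, g_fn, steps=5, lr=0.05):
    for _ in range(steps):
        x = x - lr * g_fn(x)
    return x

def vcycle(x):
    x = sgd_smooth(x, grad)                 # pre-smoothing with SGD
    x_H0 = restrict(x)
    A_H = restrict(restrict(A).T).T         # Galerkin-style coarse Hessian R A R'
    # First-order coherence ("tau") correction: the coarse gradient at x_H0
    # matches the restricted (exact) fine gradient.
    tau = restrict(grad(x, noise=0.0)) - (A_H @ x_H0 - restrict(b))
    g_H = lambda y: A_H @ y - restrict(b) + tau + 0.1 * rng.standard_normal(y.shape)
    x_H = sgd_smooth(x_H0, g_H, steps=10)   # approximate coarse-level solve
    x = x + prolong(x_H - x_H0)             # coarse correction
    return sgd_smooth(x, grad)              # post-smoothing

x = np.zeros(n)
for _ in range(20):
    x = vcycle(x)
print("distance to optimum:", np.linalg.norm(x - np.linalg.solve(A, b)))
```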
Second, SensLI, a sensitivity-based method for adaptive layer insertion, is introduced. It provides an efficient criterion for selecting where to insert new layers, achieves effective capacity growth in numerical experiments, and is extended to layer widening.
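To make the sensitivity idea concrete, the hedged sketch below ranks candidate insertion positions by the gradient norm of the loss with respect to a virtual, zero-initialized residual layer, which leaves the network function initially unchanged while its parameters still receive nonzero gradients. The helper names are hypothetical, and the actual SensLI criterion differs in its details.

```python
# Sensitivity-style scoring of candidate layer-insertion positions.
# Names and the residual construction are illustrative assumptions;
# this is not the SensLI implementation.
import torch
import torch.nn as nn

torch.manual_seed(0)
width = 16
layers = [nn.Linear(width, width), nn.Tanh(),
          nn.Linear(width, width), nn.Tanh(),
          nn.Linear(width, 1)]
x = torch.randn(64, width)
y = torch.randn(64, 1)
loss_fn = nn.MSELoss()

def insertion_sensitivity(position):
    """Gradient norm w.r.t. a virtual zero-initialized layer at `position`."""
    new = nn.Linear(width, width)
    nn.init.zeros_(new.weight)      # zero init: output unchanged at insertion,
    nn.init.zeros_(new.bias)        # but gradients are generally nonzero

    def forward(inp):
        h = inp
        for i, layer in enumerate(layers):
            if i == position:
                h = h + new(h)      # virtual residual layer before layer i
            h = layer(h)
        return h

    loss = loss_fn(forward(x), y)
    g = torch.autograd.grad(loss, list(new.parameters()))
    return sum(gi.pow(2).sum() for gi in g).sqrt().item()

scores = {p: insertion_sensitivity(p) for p in range(len(layers))}
best = max(scores, key=scores.get)
print(f"candidate sensitivities: {scores}")
print(f"insert a new layer before layer {best}")
```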
Third, a layer-wise preconditioning framework based on Frobenius-type inner products is proposed. A covariance-based construction closely related to KFAC is developed and empirically evaluated.
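The sketch below shows the general flavor of a covariance-based, KFAC-style layer-wise preconditioner: the weight gradient of a single linear layer is preconditioned by damped input-activation and output-gradient covariance factors. This is an assumption-laden illustration of the Kronecker-factored idea only, not the Frobenius-inner-product construction proposed in the thesis.

```python
# KFAC-style preconditioning of one linear layer's weight gradient:
# precond_G = S^{-1} G A^{-1}, with A the input-activation covariance and
# S the output-gradient covariance. Illustrative sketch, not the thesis's method.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(8, 4)
x = torch.randn(128, 8)         # batch of layer inputs a
y = torch.randn(128, 4)

out = layer(x)
out.retain_grad()               # keep dL/d(out) = g for the S factor
loss = nn.functional.mse_loss(out, y)
loss.backward()

a, g = x, out.grad
damping = 1e-3
A = a.T @ a / a.shape[0] + damping * torch.eye(8)   # input covariance factor
S = g.T @ g / g.shape[0] + damping * torch.eye(4)   # output-grad covariance factor

G = layer.weight.grad                               # shape (4, 8)
precond_G = torch.linalg.solve(S, G) @ torch.linalg.inv(A)
# Kronecker view: vec(precond_G) = (A^{-1} kron S^{-1}) vec(G).

lr = 0.1
with torch.no_grad():           # preconditioned step (bias update omitted)
    layer.weight -= lr * precond_G
```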
Overall, the thesis shows that classical optimization ideas can benefit neural network training when adapted to stochastic, high-dimensional, and non-convex settings, and outlines directions for future research.