Abstract
This thesis presents research on deep learning optimization, focusing on three distinct yet related topics: adaptive learning rates, first-order optimization for generative adversarial networks (GANs), and the effects of label smoothing.
First, we propose a novel approach for obtaining adaptive learning rates in gradient-based descent methods for classification tasks. Departing from traditional methods that rely on decayed expectations of gradient-based terms, our approach leverages the angle between the current gradient and a new gradient computed in the orthogonal direction. By incorporating the angle history, we determine adaptive learning rates that achieve higher accuracy than existing state-of-the-art optimizers. We provide empirical evidence of convergence and evaluate our approach on diverse benchmark datasets, employing prominent image classification architectures.
Second, we introduce a novel first-order optimization method tailored specifically to training GANs. Our method builds upon the Gauss-Newton method to approximate the min-max Hessian and employs the Sherman-Morrison inversion formula to compute the inverse. Operating as a fixed-point method that ensures the necessary contraction, our approach produces high-fidelity images with enhanced diversity across multiple datasets. Notably, it outperforms state-of-the-art second-order methods, including achieving the highest Inception Score on CIFAR10, while maintaining execution times comparable to those of first-order min-max methods.
Third, we investigate the effects of label smoothing on GAN training, examining several optimizer variants and learning rates. Our experiments reveal that combining label smoothing with a high learning rate and the CGD optimizer yields results that surpass those obtained with Adam at the same learning rate. Importantly, we establish that label smoothing plays a vital role: in its absence, training fails to produce comparable results. We also explore the impact of architectural changes on the generator's conditioning,
providing valuable insights into the factors influencing GAN performance.

Our research advances the field of deep learning optimization by delving into these interconnected areas. We present a novel method for computing adaptive learning rates, a first-order optimization method for GANs, and an analysis of the role of label smoothing. These advances offer improved accuracy in classification tasks, enhanced image generation quality, and a deeper understanding of the nuances of GAN training.