Gradient Descent
The paper introduces a novel architecture called residual networks (ResNets), which significantly improves deep neural network training by using skip connections to mitigate the vanishing gradient problem. This approach achieved state-of-the-art performance on several benchmarks, including the ImageNet dataset, and has become foundational in modern deep learning applications.
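The relevance to gradient-based training is the gradient flow: the identity skip connection gives gradients a short path back to earlier layers. As a rough sketch of the idea (assuming PyTorch; the two-convolution layout and channel count below are illustrative assumptions, not the paper's exact configuration), a residual block adds its input back onto the output of a small stack of layers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Minimal residual block: output = relu(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Skip connection: the identity path lets gradients reach earlier
        # layers without passing through every weight layer in between.
        return F.relu(out + x)

# Toy usage with hypothetical sizes: a 16-channel feature map of spatial size 8x8.
y = ResidualBlock(16)(torch.randn(1, 16, 8, 8))
```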
The line of development runs from optimization in general, to convex optimization, to first-order methods, to gradient descent, to accelerated gradient descent, and on to adaptive methods such as AdaGrad and Adam.
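To make the two ends of that progression concrete, the sketch below implements plain gradient descent and Adam on a toy quadratic. It assumes NumPy; the function names, default hyperparameters, and step counts are illustrative choices rather than values taken from any particular source.

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent: x_{t+1} = x_t - lr * grad(x_t)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

def adam(grad, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Adam: per-coordinate step sizes from running estimates of the
    gradient's first and second moments, with bias correction."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)  # first-moment (mean) estimate
    v = np.zeros_like(x)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for zero initialization
        v_hat = v / (1 - beta2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
grad = lambda x: 2 * x
print(gradient_descent(grad, x0=[3.0, -4.0]))
print(adam(grad, x0=[3.0, -4.0]))
```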