Paper-Exploration

The paper introduces a novel architecture called residual networks (ResNets), which significantly improves deep neural network training by using skip connections to mitigate the vanishing gradient problem. This approach achieved state-of-the-art performance on several benchmarks, including the ImageNet dataset, and has become foundational in modern deep learning applications.
A machine learning paradigm where a model is trained on certain tasks and then applied to new, unseen tasks without additional training. It leverages generalizable knowledge to perform well on tasks it has not explicitly encountered during training. (an instance og transfer learning)
From optimization, to convex optimization, to first order optimization, to gradient descent, to accelerated gradient descent, to AdaGrad, to Adam.
Statistical Modeling: The Two Cultures is an influential essay by Leo Breiman that delineates two approaches to statistical modeling: the "data modeling" culture, which emphasizes formal statistical inference and model fitting, and the "algorithmic modeling" culture, which prioritizes predictive accuracy and computational efficiency. Breiman argues for a shift towards the latter culture, advocating for the development and use of robust algorithms and machine learning techniques that focus on prediction rather than solely on theoretical statistical inference.
Paper exploration on SMOTE, or Synthetic Minority Over-sampling Technique, which was introduced to tackle class imbalance. Currently, it is widely adopted by practitioners and researchers alike.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: An exploration on how ViT can be used for vision.
A Unified Approach to Interpreting Model Predictions is a research paper that presents a comprehensive framework for interpreting the predictions made by machine learning models. The main goal of this approach is to provide a unified and systematic way to understand why a model makes specific predictions. The paper discusses various methods and techniques that can be applied across different types of models, such as linear models, decision trees, neural networks, etc., to gain insights into their decision-making processes. This approach is important because it helps address the "black-box" nature of complex models by making their predictions more transparent and interpretable.