On the Variance of the Adaptive Learning Rate and Beyond
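This is the paper behind RAdam: it shows that the adaptive learning rate has problematically high variance early in training and rectifies it. A minimal NumPy sketch of one rectified step, following the paper's published algorithm; the function name and flat-array state handling are my own framing:

```python
import numpy as np

def radam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One RAdam step (simplified sketch of the paper's rectified update)."""
    m = b1 * m + (1 - b1) * grad          # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias-corrected momentum
    rho_inf = 2.0 / (1 - b2) - 1
    rho_t = rho_inf - 2 * t * b2 ** t / (1 - b2 ** t)
    if rho_t > 4:                         # variance is tractable: rectify
        v_hat = np.sqrt(v / (1 - b2 ** t))
        r = np.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf) /
                    ((rho_inf - 4) * (rho_inf - 2) * rho_t))
        theta = theta - lr * r * m_hat / (v_hat + eps)
    else:                                 # fall back to SGD with momentum
        theta = theta - lr * m_hat
    return theta, m, v
```

When rho_t <= 4 the variance of the adaptive term is intractable, so the update deliberately falls back to plain momentum instead of dividing by an unreliable second-moment estimate.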
#Computer Science# This repository contains the results for the paper "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers"
#Computer Science# CS F425 Deep Learning course at BITS Pilani (Goa Campus)
#Computer Science# ADAS is short for Adaptive Step Size. Unlike optimizers that merely normalize the gradient, it fine-tunes the step size itself, truly making step-size scheduling obsolete, achieving…
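The exact update rule is specific to that repo, so purely as an illustration of tuning the step size directly rather than normalizing the gradient, here is a classic sign-based adaptation in the Rprop style; this is not ADAS's actual rule:

```python
import numpy as np

def sign_adaptive_step(theta, grad, prev_grad, step, up=1.2, down=0.5,
                       step_min=1e-6, step_max=1.0):
    """Illustrative per-parameter step-size adaptation (Rprop-style),
    NOT the ADAS repo's rule: grow the step where successive gradients
    agree in sign, shrink it where they flip."""
    agree = np.sign(grad) == np.sign(prev_grad)
    step = np.where(agree, np.minimum(step * up, step_max),
                           np.maximum(step * down, step_min))
    theta = theta - step * np.sign(grad)   # Rprop uses only the sign
    return theta, step
```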
#Computer Science# Classifying the Google Street View House Numbers (SVHN) dataset with a CNN
#Computer Science# PyTorch/TensorFlow solutions for Stanford's CS231n: "CNNs for Visual Recognition"
Reproducing the paper "Padam: Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks" for the ICLR 2019 Reproducibility Challenge
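Padam's core idea fits in one line: keep Adam/AMSGrad's moment estimates but raise the denominator to a partial power p in (0, 1/2], where p = 1/2 recovers AMSGrad and p near 0 approaches SGD with momentum. A brief sketch of one step following the paper's update (the default p = 1/8 comes from the paper; the function signature is my own):

```python
import numpy as np

def padam_step(theta, grad, m, v, v_max, t, lr=0.1, b1=0.9, b2=0.999,
               p=0.125, eps=1e-8):
    """One Padam step (sketch): Adam-style moments with a partially
    adaptive denominator v_max ** p instead of sqrt(v_max)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    v_max = np.maximum(v_max, v)          # AMSGrad-style running max
    theta = theta - lr * m / (v_max ** p + eps)
    return theta, m, v, v_max
```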
#Computer Science# Toy implementations of some popular ML optimizers using Python/JAX
#Computer Science# A collection of various gradient descent algorithms implemented from scratch in Python
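For reference, the baseline such collections usually start from is plain batch gradient descent. A self-contained example on least-squares linear regression, with synthetic data and hypothetical values chosen for illustration:

```python
import numpy as np

# Plain batch gradient descent on least-squares linear regression.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2 / len(y) * X.T @ (X @ w - y)   # gradient of mean squared error
    w -= lr * grad                          # step against the gradient
print(w)   # approaches the true coefficients [2.0, -1.0, 0.5]
```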
#Computer Science# This library provides a set of basic functions for different types of deep learning (and other) algorithms in C. The library will be updated continually.
#Computer Science# A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
Modified XGBoost implementation from scratch in NumPy, using Adam and RMSProp optimizers.
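How the repo wires an optimizer into XGBoost's leaf-weight updates is its own detail, but the textbook RMSProp rule it draws on looks like this (sketch; the signature is my own):

```python
import numpy as np

def rmsprop_step(theta, grad, sq_avg, lr=1e-3, rho=0.9, eps=1e-8):
    """One RMSProp step: divide the gradient by a running RMS of recent
    gradients so each parameter gets its own effective learning rate."""
    sq_avg = rho * sq_avg + (1 - rho) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(sq_avg) + eps)
    return theta, sq_avg
```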
#Computer Science# The project aimed to implement deep-NN/RNN-based solutions in order to develop flexible methods that can adaptively fill in, backfill, and predict time series using a large number of heterogeneous…
#Computer Science# Lookahead optimizer ("Lookahead Optimizer: k steps forward, 1 step back") for TensorFlow
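Lookahead is optimizer-agnostic: run any inner optimizer for k fast steps, then interpolate the slow weights toward the result. A NumPy sketch of that wrapper; `fast_step` is a hypothetical stand-in for the inner optimizer, not the repo's API:

```python
import numpy as np

def lookahead(fast_step, theta, k=5, alpha=0.5, n_rounds=100):
    """Lookahead wrapper (sketch): k fast steps forward, then one
    interpolation step back toward the slow weights."""
    slow = theta.copy()
    for _ in range(n_rounds):
        for _ in range(k):                      # k steps forward
            theta = fast_step(theta)
        slow = slow + alpha * (theta - slow)    # 1 step back
        theta = slow.copy()                     # restart fast weights
    return slow
```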
[Python] [arXiv/cs] Paper "An Overview of Gradient Descent Optimization Algorithms" by Sebastian Ruder
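Two of the variants that overview covers, classical and Nesterov momentum, differ only in where the gradient is evaluated. A sketch using Ruder's formulation, where `grad_fn` is a hypothetical callable returning the gradient at a point:

```python
def momentum_step(theta, grad_fn, vel, lr=0.01, gamma=0.9, nesterov=False):
    """Classical vs. Nesterov momentum, as surveyed in Ruder's overview:
    Nesterov evaluates the gradient at a look-ahead point."""
    look = theta - gamma * vel if nesterov else theta
    vel = gamma * vel + lr * grad_fn(look)   # accumulated velocity
    theta = theta - vel
    return theta, vel
```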
#Computer Science# From linear regression towards neural networks...
#Algorithm Practice# Implementation of the Adam optimization algorithm using NumPy
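The standard Adam update such an implementation contains, written as a single NumPy step (sketch; the defaults follow the original paper's beta1 = 0.9, beta2 = 0.999, and the signature is my own):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step with bias-corrected first/second moment estimates."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # correct the zero-initialization bias
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```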