/ZO-AdaMM-vs-FO-AdaMM-convergence-and-minima-shape-comparison

Implementation and comparison of zeroth-order (ZO) and first-order (FO) variants of the AdaMM (aka AMSGrad) optimizer, with an analysis of convergence rates and the shape of the minima found
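As a rough illustration of the comparison described above, here is a minimal sketch and not the repository's actual code. It assumes a toy quadratic objective, a two-point random-direction gradient estimator for the zeroth-order case, and an AMSGrad-style update without bias correction. The names `quadratic`, `fo_gradient`, `zo_gradient`, and `adamm`, and all hyperparameter values, are hypothetical choices made for this example.

```python
import math
import random


def quadratic(x):
    """Toy objective assumed for illustration: f(x) = sum of x_i^2."""
    return sum(xi * xi for xi in x)


def fo_gradient(x):
    """Exact (first-order) gradient of the toy quadratic."""
    return [2.0 * xi for xi in x]


def zo_gradient(f, x, mu=1e-3, rng=random):
    """Zeroth-order estimate from function values only.

    Draws a random direction u on the unit sphere and uses the forward
    difference (d / mu) * (f(x + mu * u) - f(x)) * u.
    """
    d = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / norm for ui in u]
    x_shift = [xi + mu * ui for xi, ui in zip(x, u)]
    scale = d * (f(x_shift) - f(x)) / mu
    return [scale * ui for ui in u]


def adamm(grad_fn, x0, steps=2000, lr=0.01, beta1=0.9, beta2=0.99, eps=1e-8):
    """AMSGrad-style update: Adam moments plus a running max of v."""
    x = list(x0)
    m = [0.0] * len(x)      # first-moment estimate
    v = [0.0] * len(x)      # second-moment estimate
    v_hat = [0.0] * len(x)  # element-wise running maximum of v
    for _ in range(steps):
        g = grad_fn(x)
        for i in range(len(x)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            v_hat[i] = max(v_hat[i], v[i])
            x[i] -= lr * m[i] / (math.sqrt(v_hat[i]) + eps)
    return x


if __name__ == "__main__":
    x0 = [1.0, -2.0, 0.5]
    rng = random.Random(0)  # fixed seed so the ZO run is repeatable
    x_fo = adamm(fo_gradient, x0)
    x_zo = adamm(lambda x: zo_gradient(quadratic, x, rng=rng), x0)
    print("FO final loss:", quadratic(x_fo))
    print("ZO final loss:", quadratic(x_zo))
```

The only difference between the two runs is the gradient oracle passed to `adamm`: the first-order run uses the exact gradient, while the zeroth-order run only queries function values, so its estimate is noisier and would typically be expected to converge more slowly.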

Primary Language: Jupyter Notebook
