danielkunin/neural-mechanics

Post-Submission TODOs

jvrsgsty opened this issue · 0 comments

Empirics

  • TPU code cleanup (Merge #1)
  • Training loss and accuracy plots to connect to generalization
  • General code cleanup: train, extract, cache, plot
  • Clean up and update README
  • Scale up to ImageNet on TPU (involves debugging some of the multiprocessing code to take full advantage of the hardware)
  • Get more data on a wider sweep of hyperparameters on TinyImageNet
  • Custom optimizers (save both the update and the squared gradient? Need a way to access the gradient over time, after the fact)
    • SGD
    • Momentum
    • Adam
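
A rough sketch of what the "save the update and the squared gradient" idea could look like, written in framework-free Python so it stands alone. Everything here (`RecordingSGD`, the `*_history` fields, the toy quadratic loss) is a hypothetical name chosen for illustration, not code from this repo; a real version would presumably wrap the training framework's optimizer and write to disk rather than keep lists in memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordingSGD:
    """Hypothetical SGD (optional heavy-ball momentum) that logs every step."""

    lr: float = 0.1
    momentum: float = 0.0  # 0.0 reduces to vanilla SGD
    velocity: List[float] = field(default_factory=list)
    # One entry per step; each entry holds one value per parameter.
    grad_history: List[List[float]] = field(default_factory=list)
    grad_sq_history: List[List[float]] = field(default_factory=list)
    update_history: List[List[float]] = field(default_factory=list)

    def step(self, params: List[float], grads: List[float]) -> List[float]:
        if not self.velocity:
            self.velocity = [0.0] * len(params)
        # Heavy-ball rule: v <- momentum * v + g, then update = -lr * v.
        self.velocity = [
            self.momentum * v + g for v, g in zip(self.velocity, grads)
        ]
        update = [-self.lr * v for v in self.velocity]
        # Record the raw gradient, its elementwise square, and the update.
        self.grad_history.append(list(grads))
        self.grad_sq_history.append([g * g for g in grads])
        self.update_history.append(update)
        return [p + u for p, u in zip(params, update)]


# Toy usage: minimise L(w) = 0.5 * w**2, whose gradient is simply w.
opt = RecordingSGD(lr=0.1)
w = [1.0]
for _ in range(3):
    w = opt.step(w, grads=list(w))
```

After the loop, `opt.grad_history`, `opt.grad_sq_history`, and `opt.update_history` each hold one entry per step, so the gradient trajectory can be inspected after the fact. An Adam variant would additionally need to record its running first and second moment estimates.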

Theory

  • Extend theory to Momentum
  • Complete backward error analysis derivation of modified gradient flow
  • Clean up Symmetries notation and equations
  • Connect to generalization
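
For reference on the backward error analysis item, the standard first-order result for plain gradient descent with step size $\eta$ on a loss $L(\theta)$ is sketched below. This is the textbook form and an assumption about the starting point; the derivation intended here (with momentum, or with the terms specific to this project) may differ.

```latex
% Gradient descent: \theta_{k+1} = \theta_k - \eta \nabla L(\theta_k).
% Matching the Taylor expansion of a flow \dot\theta = f_0 + \eta f_1
% to the discrete step at order \eta^2 gives the modified gradient flow:
\dot{\theta}
  = -\nabla L(\theta) - \frac{\eta}{2}\, \nabla^2 L(\theta)\, \nabla L(\theta)
    + O(\eta^2)
  = -\nabla \Big( L(\theta) + \frac{\eta}{4}\, \lVert \nabla L(\theta) \rVert^2 \Big)
    + O(\eta^2)
```

The second equality uses $\nabla \lVert \nabla L \rVert^2 = 2\, \nabla^2 L\, \nabla L$, so the discrete steps follow, to this order, the gradient flow of a loss with an added gradient-norm penalty.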