Post-Submission TODOs
jvrsgsty opened this issue · 0 comments
Empirics
- TPU code cleanup (Merge #1)
- Training loss and accuracy plots to connect to generalization
- General code cleanup: train, extract, cache, plot
- Cleanup and update README
- Scale up to ImageNet on TPU (involves debugging some of the multiprocessing code to take full advantage of the hardware)
- Get more data on a wider sweep of hyperparameters on TinyImageNet
- Custom Optimizers (Save both the update and the squared gradient? Provide a way to access the gradient over time, after the fact)
  - SGD
  - Momentum
  - Adam
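
A rough sketch of what the custom optimizer item could look like, for discussion only. It is plain Python with no framework dependency; the class name `RecordingSGD`, its `history` layout, and the `gradients_over_time` helper are all hypothetical and not taken from the repo. It covers SGD and Momentum only; Adam would need its own moment buffers.

```python
class RecordingSGD:
    """SGD with optional momentum that logs every step (hypothetical helper)."""

    def __init__(self, lr=0.1, momentum=0.0):
        self.lr = lr
        self.momentum = momentum
        self.velocity = None  # momentum buffer, one entry per parameter
        self.history = []     # one record per call to step()

    def step(self, params, grads):
        """Return updated params and record grad, grad squared, and update."""
        if self.velocity is None:
            self.velocity = [0.0] * len(params)
        # v <- momentum * v + g (reduces to plain SGD when momentum == 0)
        self.velocity = [
            self.momentum * v + g for v, g in zip(self.velocity, grads)
        ]
        updates = [-self.lr * v for v in self.velocity]
        self.history.append({
            "grad": list(grads),
            "grad_sq": [g * g for g in grads],
            "update": updates,
        })
        return [p + u for p, u in zip(params, updates)]

    def gradients_over_time(self, index):
        """Gradient of one parameter across all recorded steps."""
        return [record["grad"][index] for record in self.history]


# Toy usage: minimize L(w) = w^2, whose gradient is 2w.
opt = RecordingSGD(lr=0.25)
params = [1.0]
for _ in range(2):
    grads = [2.0 * p for p in params]
    params = opt.step(params, grads)
```

Keeping the raw gradient alongside the update means the two can be compared after training, at the cost of memory that grows with the number of steps; in practice the records would probably be cached to disk at some interval.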
Theory
- Extend theory to Momentum
- Complete backward error analysis derivation of modified gradient flow
- Cleanup Symmetries notation and equations
- Connect to generalization