Unofficial minimal implementation of the MeZO optimizer.
For research purposes.
Based on the paper "Fine-Tuning Language Models with Just Forward Passes".
Official repo (here)
Simply copy mezo.py into your repository and import the optimizer:
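import torch
from mezo import MeZO
# Wrap a base optimizer over your model's parameters, e.g. SGD:
opt = MeZO(torch.optim.SGD(model.parameters(), lr=0.05), eps=1e-3)
# Alternatively, wrap AdamW:
opt = MeZO(torch.optim.AdamW(model.parameters(), lr=0.005), eps=1e-3)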
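For intuition, here is a rough sketch of the kind of forward-only update that MeZO is built on, written in plain Python so it runs without PyTorch. It is not the code in mezo.py and does not reflect its API. The function name `mezo_sgd_step`, the toy `quadratic` loss, and the list-of-floats parameters are all hypothetical, chosen only for illustration. The sketch assumes a plain SGD update and uses `eps` as the perturbation scale.

```python
import random


def mezo_sgd_step(params, loss_fn, lr=0.05, eps=1e-3, seed=0):
    """One zeroth-order SGD step using two loss evaluations (illustrative only).

    params  : list of floats, updated in place
    loss_fn : callable taking the list and returning a float
    """
    def perturb(scale):
        # Re-seeding regenerates the same random direction z every time,
        # so z never has to be stored alongside the parameters.
        rng = random.Random(seed)
        for i in range(len(params)):
            params[i] += scale * eps * rng.gauss(0.0, 1.0)

    perturb(+1.0)                  # params are now theta + eps * z
    loss_plus = loss_fn(params)
    perturb(-2.0)                  # params are now theta - eps * z
    loss_minus = loss_fn(params)
    perturb(+1.0)                  # params are back at theta

    # Scalar finite-difference estimate of the slope along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # The gradient estimate is projected_grad * z; apply a plain SGD update.
    rng = random.Random(seed)
    for i in range(len(params)):
        params[i] -= lr * projected_grad * rng.gauss(0.0, 1.0)
    return loss_plus


def quadratic(theta):
    # Hypothetical toy loss with its minimum at the origin.
    return sum(t * t for t in theta)


theta = [1.0, -2.0, 0.5]
initial_loss = quadratic(theta)
for step in range(200):
    # A fresh seed per step gives a fresh random direction.
    mezo_sgd_step(theta, quadratic, lr=0.05, eps=1e-3, seed=step)
final_loss = quadratic(theta)
```

No backward pass appears anywhere: each step costs two evaluations of the loss, and the only extra state is the integer seed.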
Work in progress. May have bugs. Use at your discretion.