kozistr/pytorch_optimizer
optimizer & lr scheduler & loss function collections in PyTorch
Python · Apache-2.0
Issues
- 0
Support Adopt
#289 opened by sdbds - 0
schedulefree with palm schedule
#286 opened by sdbds - 0
Add torch._foreach_ for speed up
#287 opened by sdbds - 4
Request to add 4-bit AdamW
#208 opened by LiutongZhou - 4
SophiaH implementation is not correct
#278 opened by Vectorrent - 8
sophiah in https://github.com/booydar/LM-RMT
#194 opened by robotzheng - 0
Support SOAP
#274 opened by sdbds - 0
Support SaRA
#273 opened by sdbds - 2
Add WeLore
#262 opened by rotem154154 - 1
The `create_optimizer()` function fails when Lookahead arguments are provided
#270 opened by Vectorrent - 1
Add Adam-Mini
#246 opened by sdbds - 0
Add StableAdamW
#250 opened by tfriedel - 0
Add WSD LR scheduler
#247 opened by sdbds - 1
Modified AdaFactor by ViT paper
#236 opened by aliencaocao - 1
FAdam
#241 opened by tranquan687 - 0
Wrong typing of `reg_noise`
#239 opened by michaldyczko - 1
Plans for pytorch_optimizer v3
#164 opened by kozistr - 1
Ranger sign inversion
#232 opened by i404788 - 0
ScheduleFree
#230 opened by sdbds - 3
Aida optimizer
#220 opened by okbalefthanded - 2
[Feature request]REX LR scheduler
#217 opened by sdbds - 2
FR: Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term (WSAM)
#213 opened by LiutongZhou - 3
Ranger21 has undocumented required arguments
#214 opened by Vectorrent - 1
ipex failed for Adan from pytorch_optimizer
#210 opened by ldv1 - 6
Empty Docs Sections
#204 opened by InfluxOW - 2
Adding the CAME optimizer
#195 opened by LiutongZhou - 5
sophiah bug
#193 opened by robotzheng - 1
LOMO: LOw-Memory Optimization
#187 opened by sdbds - 2
get_chebyshev_schedule not working
#168 opened by aliencaocao - 2
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
#173 opened by redknightlois - 0
Implement optimizers
#152 opened by kozistr - 0
Implement optimizers
#138 opened by kozistr - 3
Adafactor: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
#131 opened by Bing-su