bghira/SimpleTuner

LR scheduler: Error: 'Sine' object has no attribute '_step_count'

Mr-Ye-Cao opened this issue · 5 comments

Epoch 1/22, Steps: 0%| | 29/30000 [01:08<17:47:11, 2.14s/it, lr=0, step_loss=0.245]2024-11-10 19:28:47,024 [ERROR] Failed to get the last learning rate from the scheduler. Error: 'Sine' object has no attribute '_step_count'
Epoch 1/22, Steps: 0%| | 30/30000 [01:11<18:10:36, 2.18s/it, lr=0, step_loss=0.023]2024-11-10 19:28:49,022 [ERROR] Failed to get the last learning rate from the scheduler. Error: 'Sine' object has no attribute '_step_count'
Epoch 1/22, Steps: 0%| | 31/30000 [01:13<17:42:51, 2.13s/it, lr=0, step_loss=0.118]2024-11-10 19:28:51,085 [ERROR] Failed to get the last learning rate from the scheduler. Error: 'Sine' object has no attribute '_step_count'
Epoch 1/22, Steps: 0%| | 32/30000 [01:14<19:15:41, 2.31s/it, lr=0, step_loss=0.0314]^C

@bghira hi! I love your GitHub project. It's super good and very useful.
Also, I want to know if there's a branch I can use for now to avoid this issue?
I want to fine-tune SD3XL on Ada6000.

bghira commented

you can use a different LR scheduler, like 'constant'

Yeah, I tried that before, but every different LR scheduler gives the same error:
Epoch 1/22, Steps: 0%| | 29/30000 [01:08<17:47:11, 2.14s/it, lr=0, step_loss=0.245]2024-11-10 19:28:47,024 [ERROR] Failed to get the last learning rate from the scheduler. Error: 'Sine' object has no attribute '_step_count'

Specifically, I tried setting "LR_SCHEDULER=constant" in config.env.

bghira commented

You should be using config.json, not config.env.

Thanks a lot! I tried config.json this time. With the default (which I guess is polynomial) the same error still happens, but constant works this time!
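
For anyone landing here with the same error, the workaround above (switch the scheduler to constant in config.json) can be scripted. This is only a minimal sketch, not part of SimpleTuner: the helper name `set_lr_scheduler` is hypothetical, and the key name `--lr_scheduler` is an assumption about how config.json stores options, so check it against your own file before relying on it.

```python
import json
from pathlib import Path

# Assumption: config.json stores options under CLI-style keys such as
# "--lr_scheduler". Verify this against your own config.json.
SCHEDULER_KEY = "--lr_scheduler"


def set_lr_scheduler(config_path, scheduler="constant"):
    """Hypothetical helper: set the LR scheduler in a JSON config file.

    All other keys are left untouched. Returns the updated config dict.
    """
    path = Path(config_path)
    # Start from the existing config if the file is there, else from empty.
    config = json.loads(path.read_text()) if path.exists() else {}
    config[SCHEDULER_KEY] = scheduler
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

Calling `set_lr_scheduler("config/config.json")` (the path is just an example) would switch the scheduler to constant, which is the setting reported as working in this thread.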