thu-ml/low-bit-optimizers

doesn't work directly with HF transformers trainer.

winglian opened this issue · 0 comments

Using lpmm's 4-bit AdamW inside the Hugging Face `Trainer` fails during `optimizer.step()` with the following traceback:

```
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/trainer.py", line 1779, in train
    return inner_training_loop(
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/transformers/trainer.py", line 2176, in _inner_training_loop
    self.optimizer.step()
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/accelerate/optimizer.py", line 145, in step
    self.optimizer.step(closure)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
    return wrapped(*args, **kwargs)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/optim/optimizer.py", line 373, in wrapper
    out = func(*args, **kwargs)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/lpmm/optim/adamw.py", line 230, in step
    _single_tensor_adamw4bit(**kwargs)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/lpmm/optim/adamw.py", line 426, in _single_tensor_adamw4bit
    qx, gen = vectorwise_quant(exp_avg, qmap=exp_avgs_qmap[i], shape=param.shape, **exp_avg_qmetadata)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/lpmm/functional.py", line 53, in vectorwise_quant
    qx = nonlinear_quant(qx, qmap, b, round_type=kwargs['round_type'])
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/lpmm/functional.py", line 369, in nonlinear_quant
    idx = real_nonlinear_quant(qx, qmap, b, False)
  File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/lpmm/functional.py", line 363, in real_nonlinear_quant
    return ext_quantization.pack_nonlinear(grouped_qx, qmap, b, stochastic)
RuntimeError: The type of data is not kFloat32 or kFloat16!
```
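The error message suggests the `ext_quantization.pack_nonlinear` CUDA kernel only accepts float32/float16 tensors, while the `exp_avg` state it receives is in some other dtype — most likely bfloat16, since HF `Trainer` runs with `bf16=True` keep parameters (and therefore optimizer state) in bf16. A minimal sketch in plain PyTorch (no lpmm needed; the specific trigger is an assumption based on the traceback) showing that Adam-style state buffers inherit the parameter dtype:

```python
import torch

# Optimizer state tensors are created with zeros_like(param), so they
# inherit the parameter dtype. With bf16 parameters (as under
# Trainer(bf16=True)), exp_avg ends up bfloat16 -- a dtype the lpmm
# quantization kernel presumably rejects, per the RuntimeError above.
p = torch.nn.Parameter(torch.zeros(4, dtype=torch.bfloat16))
opt = torch.optim.AdamW([p], lr=1e-3)

p.grad = torch.zeros_like(p)  # dummy gradient so step() populates state
opt.step()

exp_avg = opt.state[p]["exp_avg"]
print(exp_avg.dtype)  # torch.bfloat16 -- neither kFloat32 nor kFloat16
```

If this is the cause, a workaround would be keeping master parameters in float32 (mixed-precision autocast rather than full-bf16 weights) so the optimizer state stays in a supported dtype.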