[BUG/Help] RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half' when reproducing p-tuning fine-tuning
ysqfirmament opened this issue · 3 comments
ysqfirmament commented
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
While trying to reproduce the ADGEN dataset fine-tuning task, this error appears when running `bash train.sh`.
Running
import torch
print(torch.cuda.is_available())
prints `True`.
C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\transformers\optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
input_ids [5, 65421, 61, 67329, 32, 98339, 61, 72043, 32, 65347, 61, 70872, 32, 69768, 61, 68944, 32, 67329, 64103, 61, 96914, 130001, 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
inputs [decoded prompt text unreadable here; the console encoding mangled the output]
label_ids [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]
labels [a run of <image_-100> placeholders for the leading -100 positions, then the decoded target text (unreadable here because of the console encoding), then <image_-100> placeholders for the trailing -100 padding]
0%| | 0/3000 [00:00<?, ?it/s]03/23/2024 23:23:53 - WARNING - transformers_modules.chatglm-6b-int4.modeling_chatglm - `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
Traceback (most recent call last):
File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 430, in <module>
main()
File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 369, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1635, in train
return inner_training_loop(
File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1904, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2647, in training_step
loss = self.compute_loss(model, inputs)
File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2679, in compute_loss
outputs = model(**inputs)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 1190, in forward
transformer_outputs = self.transformer(
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 930, in forward
past_key_values = self.get_prompt(batch_size=input_ids.shape[0], device=input_ids.device,
File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 878, in get_prompt
past_key_values = self.dropout(past_key_values)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\dropout.py", line 58, in forward
return F.dropout(input, self.p, self.training, self.inplace)
File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\functional.py", line 1266, in dropout
return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'
0%| | 0/3000 [00:00<?, ?it/s]
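For context, the traceback points at `F.dropout` being called on the prefix-encoder `past_key_values` while they are a half-precision tensor sitting on the CPU; on this PyTorch build the CPU kernel behind dropout's Bernoulli mask has no float16 implementation. A minimal sketch that should reproduce the same error (that it does so on this exact build is an assumption):

```python
import torch
import torch.nn.functional as F

x = torch.zeros(4, dtype=torch.float16)    # half-precision tensor on the CPU
F.dropout(x, p=0.1, training=True)         # RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'

# the same call goes through once the tensor is on the GPU, or kept in float32 on the CPU
if torch.cuda.is_available():
    F.dropout(x.cuda(), p=0.1, training=True)
F.dropout(x.float(), p=0.1, training=True)
```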
Expected Behavior
No response
Steps To Reproduce
Put the ADGEN dataset folder into the ptuning folder.
Run `bash train.sh` inside the ptuning folder.
The error appears.
Environment
- OS: Windows 11
- Python: 3.10
- Transformers: 4.27.1
- PyTorch: 2.2.1+cu121
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True
Anything else?
No response
ysqfirmament commented
Could it be that my computer just can't handle it?
Zylsjsp commented
> Could it be that my computer just can't handle it?

I think you should first tell us your GPU model and how much VRAM it has, and also check whether your card actually supports model quantization (if I remember correctly, the README in the repo root mentions this).
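A quick way to check the card and its memory (a minimal sketch, nothing ChatGLM-specific):

```python
import torch

# Print the first visible CUDA device and its total memory; values depend on your machine.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name, f"{props.total_memory / 1024**3:.1f} GB")
else:
    print("no CUDA device visible - the run would fall back to the CPU")
```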
The default config quantizes to int4, so the VRAM requirement is very low, and your error is not an OOM anyway, so running out of VRAM can probably be ruled out (at least not at the step where this error is raised).
My suggestion is to change the quantization argument to fp16 (or just delete it). Without quantization the model only uses more VRAM, but it runs much faster: first, loading doesn't have to quantize the weights, and second, fp16 is the fastest for training and inference (in my tests, training time was fp16 << int4 < int8). See the sketch below for what the fp16 path looks like.
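In code, the fp16 path is roughly the load call from the ChatGLM-6B README (a sketch, not the exact training code). Your traceback shows the pre-quantized chatglm-6b-int4 checkpoint, so switching to fp16 also means pointing `--model_name_or_path` back at the full chatglm-6b weights; whether your train.sh has a `--quantization_bit 4` line to drop is an assumption about the default script.

```python
from transformers import AutoModel

# fp16, no int4/int8 quantization: load the full model in half precision on the GPU
# (path "THUDM/chatglm-6b" assumed; substitute your local copy of the full weights)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
```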
By the way, my setup is 4x Tesla T4 with 16 GB of VRAM each; it can run all of the p-tuning variants, but full-parameter fine-tuning runs out of VRAM. My software versions are:
- Python: 3.9.19
- Transformers: 4.27.1
- PyTorch: 1.13.1+cu116
- CUDA: 11.6
The server can't be upgraded, and another fine-tuning environment I use needs transformers>=4.30, so I spent a long time sorting out dependency hell and remember these versions very clearly.
If nothing else works, you could try matching my setup exactly; don't overthink it, just get it running first.
One more thing: I run this on Linux, so you might also try finding a server.