Deployment fails on two V100s
Cocoalate opened this issue · 2 comments
My environment:
Two V100s (32 GB × 2)
CUDA 11.0
PyTorch 1.7.1
Since my PyTorch version is fairly old and can't run the quantized checkpoints, I chose to deploy the fnlp/moss-moon-003-sft model instead, but loading it in fp16 raises the following error:
File "/root/anaconda3/envs/mossgpu/lib/python3.8/site-packages/torch/tensor.py", line 547, in __rpow__ return torch.tensor(other, dtype=dtype, device=self.device) ** self RuntimeError: "pow" not implemented for 'Half'
so I had to change the loading line to:
raw_model = MossForCausalLM._from_config(config, torch_dtype=torch.float32)
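For reference, on PyTorch 1.7.1 this __rpow__ failure usually comes from an expression of the form base ** half_tensor (e.g. when building sinusoidal/rotary position frequencies) being evaluated on a CPU half tensor. A minimal sketch of the kind of workaround that would let the model stay in fp16, assuming the offending pow can be isolated (the helper name safe_inv_freq is made up for illustration, not part of MOSS):

import torch

# Hypothetical helper: do the pow in float32, where it is implemented on CPU,
# and only cast down to half afterwards. This avoids
# RuntimeError: "pow" not implemented for 'Half' on older PyTorch builds.
def safe_inv_freq(dim: int, base: float = 10000.0, dtype: torch.dtype = torch.half) -> torch.Tensor:
    exponent = torch.arange(0, dim, 2, dtype=torch.float32) / dim
    inv_freq = 1.0 / (base ** exponent)   # pow runs in fp32
    return inv_freq.to(dtype)             # cast to fp16 only after the pow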
Running
python moss_cli_demo.py --model_name fnlp/moss-moon-003-sft --gpu 0,2
fails with the following error:
Traceback (most recent call last):
  File "moss_cli_demo.py", line 48, in <module>
    raw_model = MossForCausalLM._from_config(config, torch_dtype=torch.float32)
  File "/root/anaconda3/envs/mossgpu/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1024, in _from_config
    model = cls(config, **kwargs)
  File "/data_a/keke/workspace/MOSS/models/modeling_moss.py", line 607, in __init__
    self.transformer = MossModel(config)
  File "/data_a/keke/workspace/MOSS/models/modeling_moss.py", line 401, in __init__
    self.h = nn.ModuleList([MossBlock(config) for _ in range(config.n_layer)])
  File "/data_a/keke/workspace/MOSS/models/modeling_moss.py", line 401, in <listcomp>
    self.h = nn.ModuleList([MossBlock(config) for _ in range(config.n_layer)])
  File "/data_a/keke/workspace/MOSS/models/modeling_moss.py", line 256, in __init__
    self.mlp = MossMLP(inner_dim, config)
  File "/data_a/keke/workspace/MOSS/models/modeling_moss.py", line 235, in __init__
    self.fc_in = nn.Linear(embed_dim, intermediate_size)
  File "/root/anaconda3/envs/mossgpu/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 78, in __init__
    self.weight = Parameter(torch.Tensor(out_features, in_features))
  File "/root/anaconda3/envs/mossgpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 796, in __setattr__
    self.register_parameter(name, value)
  File "/root/anaconda3/envs/mossgpu/lib/python3.8/site-packages/accelerate/big_modeling.py", line 108, in register_empty_parameter
    module._parameters[name] = param_cls(module._parameters[name].to(device), **kwargs)
RuntimeError: CUDA out of memory. Tried to allocate 576.00 MiB (GPU 0; 31.75 GiB total capacity; 30.01 GiB already allocated; 548.00 MiB free; 30.02 GiB reserved in total by PyTorch)
Does anyone know how to get this to fit?
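For what it's worth, the traceback shows all of the weights being placed on GPU 0 alone, which is why 32 GB is not enough in fp32. A minimal sketch of spreading the checkpoint across both cards by capping per-GPU memory, assuming the model can be loaded through transformers' from_pretrained with an accelerate device map (the 28GiB caps are placeholder values, and the GPU indices assume the two visible cards are 0 and 1, e.g. via CUDA_VISIBLE_DEVICES):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "fnlp/moss-moon-003-sft"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Let accelerate shard the weights over both V100s instead of filling GPU 0;
# max_memory leaves headroom on each card for activations and the KV cache.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,          # fp16 halves the weight footprint
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "28GiB", 1: "28GiB"},
)
model.eval()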
RuntimeError: CUDA out of memory. Tried to allocate 576.00 MiB (GPU 0; 31.75 GiB total capacity; 30.01 GiB already allocated; 548.00 MiB free; 30.02 GiB reserved in total by PyTorch)
You're running out of GPU memory.
Thanks, I've got it working now, still using fp16.