Fail to run AWQ on Qwen2-7B-Instruct
gloritygithub11 opened this issue · 3 comments
gloritygithub11 commented
The config file is as follows:
base:
seed: &seed 42
model:
type: Qwen2
path: /models/Qwen2-7B-Instruct
torch_dtype: auto
calib:
name: pileval
download: False
path: /models/src/llmc/tools/data/calib/pileval
n_samples: 128
bs: -1
seq_len: 512
preproc: pileval_awq
seed: *seed
eval:
# eval_pos: []
eval_pos: [pretrain, transformed]
name: wikitext2
download: False
path: /models/src/llmc/tools/data/eval/wikitext2
bs: 1
seq_len: 2048
quant:
method: Awq
weight:
bit: 4
symmetric: False
granularity: per_group
group_size: 128
save:
save_trans: False
save_lightllm: True
save_path: ./save
I get the following error:
2024-08-23 10:39:41.133 | INFO | __main__:main:78 - wikitext2 ppl : 8.77077579498291
2024-08-23 10:39:41.133 | INFO | llmc.compression.quantization.base_blockwise_quantization:deploy:778 - -- deploy_real_quant_model start --
2024-08-23 10:39:41.133 | INFO | llmc.compression.quantization.base_blockwise_quantization:deploy:779 - quant_config : {'method': 'Awq', 'weight': {'bit': 4, 'symmetric': False, 'granularity': 'per_group', 'group_size': 128}}
2024-08-23 10:39:41.134 | INFO | llmc.models.base_model:replace_module_all:191 - Replace block index: 0/28
Traceback (most recent call last):
File "/models/src/llmc/llmc/__main__.py", line 160, in <module>
main(config)
File "/models/src/llmc/llmc/__main__.py", line 107, in main
blockwise_opt.deploy('real_quant')
File "/opt/conda/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/models/src/llmc/llmc/compression/quantization/base_blockwise_quantization.py", line 793, in deploy
self.model.replace_module_all(
File "/models/src/llmc/llmc/models/base_model.py", line 194, in replace_module_all
self.replace_module_block(module, block, block_idx, params_dict)
File "/models/src/llmc/llmc/models/base_model.py", line 210, in replace_module_block
self.replace_module_subset(module, block, subset, block_idx, params_dict)
File "/models/src/llmc/llmc/models/base_model.py", line 225, in replace_module_subset
M = module.new(m, **params_tmp_dict)
File "/opt/conda/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/models/src/llmc/llmc/compression/quantization/module_utils.py", line 544, in new
weight, scales, zeros = cls.quant_pack(module, w_q, quant_config)
File "/opt/conda/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/models/src/llmc/llmc/compression/quantization/module_utils.py", line 570, in quant_pack
weight, scales, zeros = w_q(module)
File "/models/src/llmc/llmc/compression/quantization/base_blockwise_quantization.py", line 41, in w_q
return wquantizer.real_quant_weight_dynamic(module.weight.data)
File "/models/src/llmc/llmc/compression/quantization/quant.py", line 457, in real_quant_weight_dynamic
if zeros != torch.tensor(0.0) and self.round_zp:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1904) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==2.0.0', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/opt/conda/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/opt/conda/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/models/src/llmc/llmc/__main__.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-08-23_10:39:45
host : 20b728e84f30
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 1904)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Harahan commented
If you are in a hurry, I think you can fix the error yourself by checking all the elements of the zeros tensor at the place where the error is thrown. We will fix it later.
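For reference, a minimal sketch of both the failure and the suggested workaround (not the actual llmc patch; `round_zp` here is a stand-in for `self.round_zp`): comparing a multi-element tensor to a scalar yields a boolean tensor, which cannot be used directly in an `if` condition, so the element-wise comparison has to be reduced to a single boolean first, e.g. with `.any()`.

```python
import torch

# A zeros tensor with more than one element, as produced by per-group
# asymmetric quantization (one zero-point per group).
zeros = torch.tensor([0.0, 3.0, 0.0])

# This is the failing pattern: `zeros != torch.tensor(0.0)` is an
# element-wise comparison, so Python cannot coerce it to a single bool.
try:
    if zeros != torch.tensor(0.0):
        pass
except RuntimeError as e:
    print(e)  # Boolean value of Tensor with more than one value is ambiguous

# Possible fix: reduce the comparison with .any() so the condition is a
# single boolean ("is any zero-point non-zero?") before combining it
# with other flags.
round_zp = True  # stand-in for self.round_zp
if (zeros != 0).any() and round_zp:
    print("non-zero zero-points present; rounding them")
```

The same `.any()` reduction works for the scalar case too, so it does not change behavior when `zeros` happens to be a single-element tensor.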
gloritygithub11 commented
I am just evaluating llmc to check whether it can run the various quantization methods, especially for the Qwen2-72B models. Unfortunately, it is not easy to run them all the way through. I hope this can be fixed quickly. Thank you very much.