Type casting error when using the demo commands.
ds-ssj opened this issue · 4 comments
Hi.
When I run the following command from the README:
CUDA_VISIBLE_DEVICES=0 python -m src.benchmark --num-data 1024 --strategy seqsch --vbs --fcr --lora-path ./ckpts/vicuna-response-length-perception-module
An error occurs:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/benchmark.py", line 109, in <module>
    result = benchmark(
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/benchmark.py", line 34, in benchmark
    out = model(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/generate.py", line 254, in __call__
    out = self.generate_group(prompt, **kwargs)
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/generate.py", line 185, in generate_group
    length = predictor.predict_length(
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/utils.py", line 231, in predict_length
    ret = [int(s.strip()) for s in outputs]
  File "/mnt/workspace/ssj/Sequence-Scheduling/src/utils.py", line 231, in <listcomp>
    ret = [int(s.strip()) for s in outputs]
ValueError: invalid literal for int() with base 10: '100 tokens.'
I inspected the values of outputs, which look like this:
['100', '1', '100', '100', '4', '5', '5', '100', '4', '4', '100', '100', '100', '100', '100', '1', '100', '100', '10', '100', '100', '100', '1', '3', '100', '100', '100', '100', '100', '4', '4', '10', '1', '1', '100 tokens.', '10', '100', '100', '200', '100', '10', '100', '5', '150', '140', '100', '1', '100', '4', '1', '100', '100', '100', '100', '1000', '150', '100', '100', '5', '4', '100', '10', '10', '100', '100', '1', '100', '4', '1', '100', '1', '4', '10', '10', '100', '3', '100', '100', '100', '100', '100', '100', '4', '10', '100', '1', '1', '140', '4', '5', '1', '4', '100', '500', '10', '1', '10', '5', '1', '100', '100', '1', '100', '100', '10,000 images', '5', '12', '4', '4', '4', '10', '200', '100', '4', '3', '5', '10', '1', '100', '100', '100', '4', '10', '100', '150', '10', '100', '10']
Is the outputs array correct? I used all of the config files provided in the repo. The checkpoint passed as lora-path was downloaded from HF as mentioned in the README.
Thanks!
I think the output looks correct. Your problem seems to be a corner case that we did not encounter in our experiments: our model is supposed to generate only a number, but it seems that it sometimes appends a few additional tokens.
For a quick fix, since this does not happen very often, you can use a try-except like:
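ret = []
for s in outputs:
    try:
        # Normal case: the model returned only a number.
        v = int(s.strip())
        ret.append(v)
    except ValueError:
        # Fall back to a default length when extra tokens are appended.
        ret.append(100)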
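The try-except above replaces every malformed output with 100, even when the string starts with a usable number (for example '100 tokens.' or '10,000 images' from the list in the report). As a rough sketch of an alternative, the leading integer could be extracted with a regular expression and the default used only when no number is found. The helper name parse_length, the default of 100, and the choice to treat commas as thousands separators are assumptions for illustration, not part of the repository.

```python
import re

# Hypothetical helper, not part of the Sequence-Scheduling code base.
# Matches an integer at the start of the string, optionally with commas.
_LEADING_INT = re.compile(r"^\s*(\d[\d,]*)")


def parse_length(text, default=100):
    """Return the leading integer in `text`, or `default` if there is none."""
    match = _LEADING_INT.match(text)
    if match is None:
        return default
    # Assumption: commas are thousands separators, so '10,000' means 10000.
    digits = match.group(1).replace(",", "")
    return int(digits)


outputs = ["100", "100 tokens.", "10,000 images", "n/a"]
ret = [parse_length(s) for s in outputs]
print(ret)  # [100, 100, 10000, 100]
```

Whether a prediction such as 10000 should be trusted or capped is a separate decision; this sketch only changes how the string is parsed.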
One possible reason for this is that different library versions can lead to unstable outputs.
Thank you for your assistance. I understand that sometimes a legitimate output might be followed by an additional token.
May I ask which library would cause some difference in this output behavior?
Not sure. The most likely ones are transformers and torch.
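To compare environments, it may help to report the installed versions of those two packages. The snippet below is a minimal sketch using only the standard library, so it does not need to import the packages themselves. The function name installed_versions is hypothetical, and the thread does not state which versions the authors used.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "torch")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```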