xlang-ai/Binder

Generation error: list index out of range

Eterance opened this issue · 5 comments

Hi! It's me again.🤣

When I run the annotation script on the wikitq dataset with the following command and arguments:

python -u scripts/annotate_binder_program.py --dataset wikitq \
--dataset_split test \
--prompt_file templates/prompts/wikitq_binder.txt \
--n_parallel_prompts 1 \
--n_processes 2 \
--max_generation_tokens 512 \
--temperature 0.4 \
--sampling_n 20 \
-v

The error "generation error: list index out of range" occurred in 76 samples. I took the first 5 samples (wtqid#nu-30, 208, 263, 279, 367) for debugging and found that the input still exceeded the length limit even when n_shot was reduced to 0.

[Screenshot 1: scripts/annotate_binder_program.py]

The list few_shot_prompt_list is empty, so this step throws the exception.

[Screenshot 2: generation/generator.py]

Will the missing results of these 76 samples have any effect on the execution stage? How should I solve this problem?

Thanks!

OK, could you check the total number of tokens in the prompt when the error is thrown?

That is, check whether the variable prompt is over 8001 tokens (the maximum length for code-davinci-002). If it is, then check why this error is thrown. As far as I remember, we didn't write any code that throws this error.

Thanks for the reply!

I have shown the values of some local variables in the left panel of the screenshots above, and the breakpoint (the yellow highlighted line) marks where the error was thrown.

As screenshot 1 shows, for sample wtqid#nu-208, even with n_shot reduced to 0, the total token count of the prompt (len(tokenizer.tokenize(prompt))) is 10259, still over max_prompt_tokens = 7489 (8001 - 512).
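To make the arithmetic explicit, here is a minimal sketch of the budget check. The numbers are the ones quoted in this thread; the function names are hypothetical and are not taken from the Binder code.

```python
# Values quoted in this thread; the function names are hypothetical.
MAX_CONTEXT_TOKENS = 8001      # limit cited here for code-davinci-002
MAX_GENERATION_TOKENS = 512    # --max_generation_tokens from the command


def max_prompt_tokens(context_limit: int, generation_tokens: int) -> int:
    """Tokens left for the prompt after reserving room for the completion."""
    return context_limit - generation_tokens


def prompt_fits(n_prompt_tokens: int, budget: int) -> bool:
    """True if a prompt with n_prompt_tokens tokens fits within the budget."""
    return n_prompt_tokens <= budget


budget = max_prompt_tokens(MAX_CONTEXT_TOKENS, MAX_GENERATION_TOKENS)
overflow = 10259 - budget  # token count observed for wtqid#nu-208 at n_shot = 0
print(budget, prompt_fits(10259, budget), overflow)  # 7489 False 2770
```

So this sample is about 2770 tokens over the budget before any few-shot example is added, which suggests the table and question alone are too long and that lowering n_shot cannot help.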

As screenshot 2 shows, because n_shot = 0, the list few_shot_prompt_list in the method generator.build_few_shot_prompt_from_file() has no elements, so the line few_shot_prompt_list[-1] = few_shot_prompt_list[-1].strip() throws this exception.
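The failure can be reproduced in isolation. The snippet below is only a sketch: it shows that indexing an empty list with [-1] raises the same message, and one possible guard. The helper strip_last_shot is hypothetical and is not part of generation/generator.py.

```python
# Reproduce the error: indexing an empty list with [-1] raises IndexError.
few_shot_prompt_list = []
try:
    few_shot_prompt_list[-1] = few_shot_prompt_list[-1].strip()
except IndexError as e:
    error_message = str(e)

print(error_message)  # list index out of range


def strip_last_shot(shots: list) -> list:
    """Hypothetical guard: strip the last few-shot prompt only if one exists."""
    if shots:
        shots[-1] = shots[-1].strip()
    return shots


print(strip_last_shot([]))                 # []
print(strip_last_shot(["shot 1\n\n", " shot 2 \n"]))
```

Note that a guard like this only avoids the IndexError. A prompt that is still over the token limit at n_shot = 0 would presumably fail later at the API call anyway, so the sample would be skipped either way.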

The exception is caught by the outer except in the method worker_annotate() in scripts/annotate_binder_program.py, as shown in the screenshot below.

[Screenshot]
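For readers without the screenshot, the pattern looks roughly like the sketch below: each sample is annotated inside a try/except, so one failing sample is recorded and the loop continues. This is an illustration of the pattern only, not the actual worker_annotate() code; annotate_all, annotate_one, and the sample IDs are hypothetical.

```python
from typing import Callable, Dict, Tuple


def annotate_all(
    samples: Dict[str, str],
    annotate_one: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Annotate every sample; collect failures instead of stopping the run."""
    results: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for sample_id, sample in samples.items():
        try:
            results[sample_id] = annotate_one(sample)
        except Exception as e:  # broad on purpose: one bad sample must not end the run
            failed[sample_id] = f"generation error: {e}"
    return results, failed


def fake_annotate(sample: str) -> str:
    """Stand-in annotator: fails like the empty few-shot list for 'long' samples."""
    if sample == "too long":
        return [][-1]  # raises IndexError: list index out of range
    return sample.upper()


results, failed = annotate_all({"a": "ok", "b": "too long"}, fake_annotate)
print(results)
print(failed)
```

Keeping the failed IDs in a separate dict makes it easy to count the skipped samples (76 in my run) and to inspect them afterwards.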

I see, so these exceptions were caught.
Then I think this is expected. As far as I remember, wikitq does have some extremely long examples that are too long for the OpenAI Codex model to handle. It won't actually affect the results much.

Got it. Thank you for your reply!