CodeT5 input for APPS/MBPP problems
ysymyth opened this issue · 2 comments
Hi, I wonder what the exact input format is for APPS/MBPP problems fed into CodeT5-large-ntp-py or CodeT5-finetuned_CodeRL? I tried """{Problem}""" but it doesn't work well: the model generates a lot of comments or natural-language output.
Would appreciate an example for each dataset as they are not found in repo/paper. Thanks!
We followed the default formats used in these benchmarks. For APPS, the format is defined in the code at CodeRL/datasets/apps_dataset.py (line 308 in 51db4ff), where q_str is the question description, s_str is the starter code (if any), and answer_type is the type of problem in APPS (e.g. Call-based / Standard-input).
For MBPP, please refer to the original paper for the input format.
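For readers who want something concrete, here is a minimal sketch of how such an APPS prompt could be assembled from q_str, s_str, and answer_type. The function name build_apps_prompt, the call_based flag, and the exact marker strings ("QUESTION:", "ANSWER:", "Use Call-Based format", "Use Standard Input format") are assumptions based on the layout commonly used in the APPS benchmark code, not a copy of the referenced line, so check them against apps_dataset.py before relying on them.

```python
def build_apps_prompt(q_str: str, s_str: str = "", call_based: bool = False) -> str:
    """Assemble an APPS-style prompt (hypothetical helper, not from the repo).

    q_str: the question description
    s_str: the starter code, or an empty string if the problem has none
    call_based: True for call-based problems, False for standard-input ones
    """
    # Assumed marker strings; verify against the referenced line in the repo.
    if call_based:
        answer_type = "\nUse Call-Based format\n"
    else:
        answer_type = "\nUse Standard Input format\n"

    # Question, optional starter code, problem type, then the answer marker.
    return "\nQUESTION:\n" + q_str + "\n" + s_str + "\n" + answer_type + "\nANSWER:\n"


if __name__ == "__main__":
    prompt = build_apps_prompt("Read two integers and print their sum.")
    print(prompt)
```

The resulting string would then be tokenized and passed to the model as the encoder input. Feeding only the raw problem text, without these markers, may be one reason the model drifts into comments or natural language.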
Hi, where can I find the original MBPP paper? I want to test the model on MBPP, but I don't know how to get it.