Left padding leaves stray tokens, causing compile errors
Closed this issue · 2 comments
I think a recent update to automodel.py broke the generated code.
Testcase: HumanEval_0_has_close_elements
Model used: Code Llama 13B Instruct
Prompt:
//Check if in given array of numbers, are any two numbers closer to each other than
// given threshold.
// >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
// false
// >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
// true
function has_close_elements(numbers: number[], threshold: number): boolean {
Generated code:
an {
for (let i = 0; i < numbers.length; i++) {
for (let j = i + 1; j < numbers.length; j++) {
if (Math.abs(numbers[i] - numbers[j]) < threshold) {
return true;
}
}
}
return false;
}
The `an {` at the beginning is redundant. The whole pipeline worked about a month ago, but now every model I test returns code with roughly 4-5 leftover padding tokens at the start.
I switched back to the automodel.py from commit 334b49c4d3f4b9c4082b7c724b1d6095075cc13b (auto bf16) and it works fine again.
Can you please double-check?
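For anyone hitting the same thing, here is a minimal sketch of the failure mode as I understand it (toy tokenizer and token ids, not MultiPL-E's actual automodel.py code): with left padding, `generate()` returns the padded prompt plus the completion, so slicing the output by the unpadded prompt length leaves stray prompt/pad tokens at the front of the decoded completion.

```python
PAD = 0  # assumed pad-token id for this toy example

def decode(ids, vocab, skip_pad=True):
    """Toy stand-in for tokenizer.decode()."""
    return "".join(vocab[i] for i in ids if not (skip_pad and i == PAD))

# Toy vocabulary standing in for a real tokenizer.
vocab = {0: "<pad>", 1: "function f() ", 2: "{ return true; }"}

padded_prompt = [PAD, PAD, 1]        # left-padded prompt, width 3
generated = padded_prompt + [2]      # generate() returns prompt + completion

# Buggy slicing: using the *unpadded* prompt length (1) leaves leftover
# prompt/pad tokens in the "completion".
buggy = decode(generated[1:], vocab, skip_pad=False)

# Fix: slice by the full padded input width (input_ids.shape[1] in HF terms),
# so only the newly generated tokens are decoded.
completion = decode(generated[len(padded_prompt):], vocab)

print(buggy)       # stray "<pad>function f() " prefix survives
print(completion)  # clean "{ return true; }"
```

That would match the symptom above: a handful of leftover tokens glued onto the front of every completion.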
Yeah sorry I broke it yesterday. I have a fix that I’ll push as soon as I’m back from the vet.
Btw, unless you're really digging into MultiPL-E internals, you may prefer the BigCode evaluation harness. (I prefer MultiPL-E for a couple of reasons, but largely because we wrote it.)