Tokenizer errors out when inferencing llama2
navidsam opened this issue · 1 comments
I was getting a `failed read` error at this line in `run.c` when I ran `./run llama2_7b.bin` (code snippet below if you don't want to click the link):
```c
// in build_tokenizer
int len;
for (int i = 0; i < vocab_size; i++) {
    if (fread(t->vocab_scores + i, sizeof(float), 1, file) != 1) { fprintf(stderr, "failed read\n"); exit(EXIT_FAILURE); }
```
Note that I followed the README instructions line by line to get to that stage. The fix I found is not ideal but seems to work: simply override the read `vocab_size` (which is 32016, coming from the llama2 configuration) with a value of 32000. This seems to be the size of the current `tokenizer.bin` in the repo:
```c
void build_tokenizer(Tokenizer* t, char* tokenizer_path, int vocab_size) {
    // i should have written the vocab_size into the tokenizer file... sigh
    vocab_size = 32000; // my hacky fix
    t->vocab_size = vocab_size;
```
Would love to hear what others think about this, or whether anyone else has run into this issue. I'd be surprised if no one has hit it before 😅 or else I'm doing something wrong.
For me, the `tokenizer.bin` worked fine with the llama2 7B base model after exporting it into the legacy format.