Patch for /trunk/word2vec.c
This patch fixes a bug that discards the last word of the vocabulary after
sorting when the input file contains no newline character.
If the input file has no newline, vocab[0].cn == 0. The sort skips vocab[0],
but the for loop that follows it does not: the loop treats vocab[0] as a word
below min_count, decrements vocab_size, and frees the memory of the last word.
Later in the same loop, the hash is still computed for that last word if its
count is at least min_count, so the freed string is read again. In addition,
the realloc only needs to allocate vocab_size * sizeof(struct vocab_word).
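
The patch itself is not shown here, so the following is only a minimal sketch of one way to guard the loop, not the submitted change. It assumes slot 0 is reserved and must never be discarded, and that slots 1..n-1 are already sorted by descending count. The names prune_vocab and dup_word are hypothetical, the struct is reduced to two fields, and the hash rebuild is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduced, illustrative version of the vocabulary entry. */
struct vocab_word {
  long long cn; /* occurrence count */
  char *word;   /* heap-allocated string */
};

/* Hypothetical helper: heap copy of a string (returns NULL on failure). */
static char *dup_word(const char *s) {
  char *copy = malloc(strlen(s) + 1);
  if (copy != NULL) strcpy(copy, s);
  return copy;
}

/* Hypothetical sketch of the pruning loop. Entries with cn < min_count are
 * dropped from the tail, but slot 0 is exempt, so a zero count there (input
 * without a newline) no longer costs the last word its memory.
 * Returns the new vocabulary size. */
static int prune_vocab(struct vocab_word **vocab_ptr, int vocab_size,
                       long long min_count) {
  struct vocab_word *vocab = *vocab_ptr;
  int size = vocab_size;
  assert(vocab != NULL && size > 0);
  for (int a = 0; a < size; a++) {
    if (a != 0 && vocab[a].cn < min_count) {
      /* Sorted order means every discarded entry sits at the tail. */
      vocab_size--;
      free(vocab[vocab_size].word);
      vocab[vocab_size].word = NULL;
    }
    /* The real code would recompute the hash for kept entries here. */
  }
  /* Shrink to exactly vocab_size entries; keep the old block on failure. */
  struct vocab_word *shrunk =
      realloc(vocab, (size_t)vocab_size * sizeof *vocab);
  if (shrunk != NULL) *vocab_ptr = shrunk;
  return vocab_size;
}
```

With three entries (a zero-count slot 0, then words with counts 10 and 7) and min_count 5, the unguarded loop would free the word with count 7 on the first iteration; with the guard, all three entries survive.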
Original issue reported on code.google.com by FerroMrkva on 5 Feb 2014 at 11:24