tzmartin/word2vec

Patch for word2vec.c: Fix memory leaks

Opened this issue · 4 comments

word2vec does not free allocated objects correctly. It also reads freed objects.
Attached patch fixes this issue.

Original issue reported on code.google.com by tetsuo.s...@gmail.com on 17 Aug 2013 at 6:23

Attachments:

FYI I had to roll back this patch because when using demo-word.sh it was 
producing a vector.bin file that was unreadable by the distance utility.

Possibly related (but I doubt it): this was after also changing include headers 
for Mac OS X compilation (see 
https://github.com/dav/word2vec/commit/94bc3d7860670053d4785a5aa07737235ae4e272)

btw, I think you meant DestroyVocab not DestoryVocab

Original comment by dav.yagi...@gmail.com on 17 Aug 2013 at 8:01

Thanks dav.yagi...@gmail.com for comments and typos. The fixed patch attached.

There was a bug in SortVocab in which the special word "</s>" was freed.
On Ubuntu 12.04.2, demo-word.sh works fine. I think it works fine on OS X as 
well.

I checked your commits on the github repository, and I think the changes are 
not relate to this issue. It is related to compilation issues on OS X because 
of missing header "malloc.h".

Original comment by tetsuo.s...@gmail.com on 17 Aug 2013 at 10:37

Attachments:

New patch seems to work fine, thanks.

Original comment by dav.yagi...@gmail.com on 18 Aug 2013 at 6:45

Where is it reading a freed memory?
The added error messages assume the error, e.g. no such file or directory.
Is it worth the bother of freeing a lot of this memory when the OS will do it 
much more efficiently when the program exits soon after?

Original comment by ralph.co...@gmail.com on 23 Aug 2013 at 9:30