bleqdyce/word2vec

Build for Mac?

GoogleCodeExporter opened this issue · 14 comments

What steps will reproduce the problem?
On a Mac:
1. svn checkout http://word2vec.googlecode.com/svn/trunk/
2. make

What is the expected output?
Binary is emitted.

What do you see instead?
pindari:word2vec pmonks$ make
gcc word2vec.c -o word2vec -lm -pthread -Ofast -march=native -Wall 
-funroll-loops -Wno-unused-result
cc1: error: invalid option argument ‘-Ofast’
cc1: error: unrecognized command line option "-Wno-unused-result"
word2vec.c:1: error: bad value (native) for -march= switch
word2vec.c:1: error: bad value (native) for -mtune= switch
make: *** [word2vec] Error 1
pindari:word2vec pmonks$

What version of the product are you using?
SVN r32

On what operating system?
Mac OSX 10.8.4

Original issue reported on code.google.com by peter.mo...@alfresco.com on 15 Aug 2013 at 5:45

Updating gcc will fix this issue: e.g., 
http://superuser.com/questions/517218/how-do-i-install-gcc-4-7-2-on-os-x-10-8. 
(You'll probably have other issues after that, though. I still can't get this 
to work on OS X.)

Original comment by jesse.cz...@gmail.com on 15 Aug 2013 at 6:34

Got it to work with the following steps:

1) Update gcc to 4.7: 
http://superuser.com/questions/517218/how-do-i-install-gcc-4-7-2-on-os-x-10-8
2) Change "-march=native" to "-msse4.2" in makefile
3) Add "-I/usr/include/sys" to makefile "CFLAGS = " statement

Original comment by jesse.cz...@gmail.com on 15 Aug 2013 at 7:22

It compiles if you remove the -Ofast, -Wno-unused-result and -march gcc 
options, and replace malloc.h with stdlib.h in the include statements. There 
might be a better way, though.
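
A minimal sketch of those edits as shell functions, in case it helps. It assumes the makefile keeps its options on a single 'CFLAGS = ...' line and that the sources spell the header as '#include <malloc.h>'; 'strip_flags' and 'use_stdlib' are hypothetical helper names, not part of word2vec.

```shell
# Hypothetical helper: drop the options that the older gcc rejected.
# Only the CFLAGS line is touched. Output goes to a temporary file first,
# so this behaves the same with BSD sed (OS X) and GNU sed.
strip_flags() {
    makefile="$1"
    sed -e '/^CFLAGS/s/ -Ofast//' \
        -e '/^CFLAGS/s/ -Wno-unused-result//' \
        -e '/^CFLAGS/s/ -march=[^ ]*//' \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}

# Hypothetical helper: include stdlib.h instead of malloc.h in one source file.
use_stdlib() {
    src="$1"
    sed -e 's/#include <malloc\.h>/#include <stdlib.h>/' \
        "$src" > "$src.tmp" && mv "$src.tmp" "$src"
}
```

From the checkout directory this would be run as 'strip_flags makefile' followed by 'for f in *.c; do use_stdlib "$f"; done'.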

Original comment by eaton....@gmail.com on 15 Aug 2013 at 8:01

Thanks eaton...@gmail.com - that appears to have worked (binaries run, at least 
when not provided with arguments).

Original comment by peter.mo...@alfresco.com on 15 Aug 2013 at 8:15

This is my modified build for Mac. It worked with text8.zip. I suggest 
downloading and extracting it manually, since the script downloads it with 
wget, which it cannot find on Mac.

Original comment by akshayub...@gmail.com on 17 Aug 2013 at 5:57

Attachments:

./distance in this mac package works with the bin generated from text8, but not 
with the freebase bin file. Just me or everyone?

Original comment by libins...@gmail.com on 17 Aug 2013 at 1:38

A slightly better way to go about it: if you replace gcc with clang, which is 
what OS X is sticking with now, you only need to switch -Ofast to -O2 and 
-Wno-unused-result to -Wunused-result.

Original comment by dluna...@gmail.com on 18 Aug 2013 at 11:20

I had to do the following to get the demos to work on my 10.8.2 Hackintosh:

* in the makefile:
    * replace 'gcc' with 'clang'
    * replace '-Ofast' with '-O2'
    * replace '-Wno-unused-result' with '-Wunused-result'

* where needed in the *.c files, replace '#include <malloc.h>' with '#include 
<stdlib.h>'

* install 'wget' (I used the instructions at 
http://osxdaily.com/2012/05/22/install-wget-mac-os-x/)
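
The makefile edits in the list above could be scripted roughly as follows. This is only a sketch: it assumes the makefile has a 'CC = gcc' line and a single 'CFLAGS = ...' line, and 'use_clang' is a hypothetical helper name.

```shell
# Hypothetical helper: switch the makefile to clang with -O2.
# /^CC/ matches only the compiler line, /^CFLAGS/ only the options line.
use_clang() {
    makefile="$1"
    sed -e '/^CC/s/gcc/clang/' \
        -e '/^CFLAGS/s/-Ofast/-O2/' \
        -e '/^CFLAGS/s/-Wno-unused-result/-Wunused-result/' \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}
```

Usage would be 'use_clang makefile', followed by 'make'.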

If the files text8 and text8-phrase do not appear after running one of the 
scripts, you can download them from http://mattmahoney.net/dc/text8.zip.

This looks like really cool technology!

Original comment by GreggInCA@gmail.com on 22 Aug 2013 at 2:55

Instead of getting or building wget, why not use curl?
For example, in demo-word.sh replace the wget call with:
curl -o text8.gz http://mattmahoney.net/dc/text8.zip
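
If the same script should run on machines with either tool, one option is a small wrapper. A sketch, where 'fetch' is a hypothetical helper name and the script is assumed to need only "download URL to FILE":

```shell
# Hypothetical helper: download URL ($1) to FILE ($2) with whichever tool exists.
fetch() {
    url="$1"
    out="$2"
    if command -v wget >/dev/null 2>&1; then
        wget "$url" -O "$out"
    elif command -v curl >/dev/null 2>&1; then
        # -L follows redirects; -o names the output file
        curl -L -o "$out" "$url"
    else
        echo "neither wget nor curl found" >&2
        return 1
    fi
}
```

The download line in the demo script would then become 'fetch http://mattmahoney.net/dc/text8.zip text8.gz'.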

Original comment by e...@vanstegeren.com on 27 Aug 2013 at 1:08

CFLAGS = -lm -lc -pthread -O2 -msse4.2 -Wall -funroll-loops -Wunused-result

and replaced or removed all (where already present):
#include <malloc.h>
with:
#include <stdlib.h> 

Original comment by florian.leitner on 18 Nov 2013 at 2:27

[deleted comment]
After compiling on Mavericks (a simple substitution of stdlib.h for malloc.h, 
with no changes to the compiler parameters), word2vec works well with 
demo-word.sh and demo-phrases.sh, but not with demo-word-accuracy. I get a 
segfault at line 7. Running only line 7 (as I already have the vectors.bin 
produced by demo-word.sh), I get:
./compute-accuracy vectors.bin 30000 < questions-words.txt
capital-common-countries:
Segmentation fault: 11
Any idea?

Original comment by piero.mo...@gmail.com on 20 Nov 2013 at 5:03

It could be related to the non-portable call to gzip when unpacking the test 
data. In fact, if you look at the demo scripts and change the line with gzip to 
the line with unzip, the demo should run.

  #gzip -d text8.gz -f
  unzip -p text8.gz > text8
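
A slightly more defensive sketch checks what the file really is before unpacking it, since the demo saves text8.zip under the name text8.gz. 'unpack' is a hypothetical helper name; the sketch assumes the archive holds a single file.

```shell
# Hypothetical helper: unpack ARCHIVE ($1) to FILE ($2) without trusting the suffix.
unpack() {
    archive="$1"
    out="$2"
    # unzip -t tests the archive, so it only succeeds on a real zip file
    if command -v unzip >/dev/null 2>&1 && unzip -tq "$archive" >/dev/null 2>&1; then
        # -p writes the member's bytes to stdout with no extra messages
        unzip -p "$archive" > "$out"
    else
        # -d decompresses, -c writes to stdout and keeps the input file
        gzip -dc "$archive" > "$out"
    fi
}
```

In the demo scripts this would be called as 'unpack text8.gz text8'.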

Regarding the malloc/stdlib error, you can add a block of preprocessor 
directives that checks whether __APPLE__ is defined. Something like the 
following should work with distance.c, word-analogy.c, and compute-accuracy.c:

#ifdef __APPLE__
#include <sys/malloc.h>
#include <stdlib.h>
#else
#include <malloc.h>
#endif

Best of luck!
Paul


Original comment by paulrigo...@gmail.com on 13 Dec 2013 at 10:51

I had issues on OS X 10.9.2, and my fix was to install gcc 4.7 using MacPorts:

1. "sudo port install gcc47". This installs gcc as gcc-mp-4.7, so you will 
need to change the first line in the makefile to refer to that instead of just 
"gcc".
2. Some headers are needed from /usr/include/sys, so you have to add 
"-I/usr/include/sys" to the CFLAGS statement in the makefile.
3. Unfortunately, the header file time.h in /usr/include/sys is not the one you 
want, because it doesn't define the clock_t type. So we have to write 
"#include </usr/include/time.h>" explicitly in word2vec.c and in any other 
files that declare variables of type clock_t.
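
Steps 1 and 2 touch only the makefile, so they could be scripted along these lines. This is a sketch under assumptions: the makefile has a 'CC = ...' line and a single 'CFLAGS = ...' line, and 'use_compiler' is a hypothetical helper name. The compiler name is passed in rather than hard-coded.

```shell
# Hypothetical helper: point the makefile at another compiler ($2) and
# append an include directory ($3) to the end of the CFLAGS line.
use_compiler() {
    makefile="$1"
    cc="$2"
    incdir="$3"
    sed -e "s|^CC *=.*|CC = $cc|" \
        -e "/^CFLAGS/s|\$| -I$incdir|" \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}
```

It would be called as 'use_compiler makefile NAME /usr/include/sys', where NAME is the name of the compiler binary that MacPorts installed. Step 3 still has to be done by hand in the .c files.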

Hopefully this will save others time.
Eddie

Original comment by edeussil...@gmail.com on 28 Mar 2014 at 10:38