tonytonyjan/jaro_winkler

Build error on Windows

Closed this issue · 11 comments

λ gem install jaro_winkler
Temporarily enhancing PATH to include DevKit...
Building native extensions.  This could take a while...
ERROR:  Error installing jaro_winkler:
        ERROR: Failed to build gem native extension.

    C:/Ruby200/bin/ruby.exe extconf.rb
creating Makefile

make "DESTDIR=" clean

make "DESTDIR="
generating jaro_winkler-i386-mingw32.def
compiling jaro_winkler.c
jaro_winkler.c: In function 'distance':
jaro_winkler.c:8:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
jaro_winkler.c:12:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
jaro_winkler.c:15:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
jaro_winkler.c:21:5: error: 'for' loop initial declarations are only allowed in C99 mode
jaro_winkler.c:21:5: note: use option -std=c99 or -std=gnu99 to compile your code
make: *** [jaro_winkler.o] Error 1

make failed, exit code 2

Gem files will remain installed in C:/Ruby200/lib/ruby/gems/2.0.0/gems/jaro_winkler-1.1.1 for inspection.
Results logged to C:/Ruby200/lib/ruby/gems/2.0.0/extensions/x86-mingw32/2.0.0/jaro_winkler-1.1.1/gem_make.out

Sheer coincidence that this gem was released just hours before I needed a Jaro-Winkler implementation! I had the same thought process about other gems (s1.split // -> s1.chars etc) and this looks excellent.

However it doesn't install for me, seems like you need to add a flag (-std=gnu99) for GCC.

Thanks for your feedback! It seems to be caused by for loop style. I guess replacing for(int i = 0; ... ) by int i; for(i = 0; ...) in distance.c should solve this issue. But I have to say sorry that this gem does not support Windows currently since I don't have a Windows environment right now 😢. (It runs on Mac and Unix, and fallback to pure Ruby version in JRuby.)

I need some time to get a windows, and I also appreciate you can send a pull request :).

I added $CFLAGS << ' -std=gnu99' to the 2nd line of extconf.rb and the gem installed successfully.
Requiring the gem shows this error though:

C:/Ruby200/lib/ruby/site_ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- jaro_winkler/jaro_winkler.so (LoadError)

The .so file is actually there, but isn't found. This might be my fault compiling the gem incorrectly, in any case $CFLAGS << ' -std=gnu99' is worth adding, probably with an if statement so it's only applied for Ruby on Windows.

I'll keep using AMatch for now, maybe you should add hotwater and amatch benchmark comparisons in future too. =]

Edit: This line shows you may have encountered that problem already, if you can apply that fix to Windows and add the C flag it should work just like JRuby. Didn't test that yet though.

I didn't knowhotwater and amatch before, thanks for noticing. I've added them to benchmark.

user system total real
jaro_winkler 0.370000 0.000000 0.370000 ( 0.370106)
fuzzystringmatch 0.140000 0.000000 0.140000 ( 0.147716)
hotwater 0.280000 0.000000 0.280000 ( 0.284125)
amatch 0.950000 0.000000 0.950000 ( 0.958045)

As the table shown, my work is slower than native version of fuzzaystringmatch and hotwater. After roughly reading their source code, it seems that they doesn't support UTF-8 string, in my submission, there is something to do with it because I spend more time to compute UTF-8 byte length. I'll try to enhance my project to be faster and still support UTF-8 string in the future.

Finally, I've made a patch to fallback to pure Ruby on windows, it supposed to solve this issue temporarily.

@Arcovion I wonder if you can help me trying this branch?

Tested that branch and all I had to change was if RUBY_PLATFORM['win']if Gem.win_platform?
Seems to work well!

Rehearsal --------------------------------------------------
jaro_winkler_r  14.570000   0.000000  14.570000 ( 14.756844)
jaro_winkler_c   0.374000   0.000000   0.374000 (  0.376021)
---------------------------------------- total: 14.944000sec

                     user     system      total        real
jaro_winkler_r  14.446000   0.000000  14.446000 ( 14.510830)
jaro_winkler_c   0.374000   0.000000   0.374000 (  0.378022)

version 1.2.7 released. Thanks for your help! @Arcovion 🍺

Great, 1.2.7 installs fine.
One more thing: maybe change :case_match to :ignore_case?. I think case match is a bit ambiguous and ignore case is more commonly used with regex etc.

You are right, :ignore_case is better naming. I've fixed in 1.2.8.

Thanks, good luck with this gem. 👍

BTW I found that these:

s1, s2 = 'henka', 'henkan'
p Amatch::JaroWinkler.new(s1).tap{ |m| m.scaling_factor = 0.1 }.match s2
p JaroWinkler.distance s1, s2, ignore_case: true, threshold: 0.5, weight: 0.1

are the same, AMatch (seems to) use 0.5 threshold and doesn't compare case by default.

@Arcovion I have no idea how amatch was implemented on account of its lack of readability of source code. In my opinion there is no threshold in amatch: ref.