seomoz/simhash-py

Building on Windows


rth commented

I was wondering if anyone is using simhash-py on Windows?
As far as I understand, on Windows the 64-bit gcc (mingw64) is still experimental and it's better to use Microsoft compilers; the compiler used to build Python extensions also depends on the Python version.

I have done some tests building simhash-py with conda in the rth:win-ci branch (it is based on top of PR #27). The build output is available in Appveyor CI, and the situation is as follows. For all Python versions, setuptools has to be used instead of distutils because of a bug in distutils on 64-bit Windows when using Visual Studio ([1], [2]). Then:

  • Python 2.7: the build fails because the compiler cannot find stdint.h, which is not included in Visual Studio 2008, the compiler used to build extensions for Python 2.7 through 3.2.

  • Python 3.4: with Visual Studio 2010, there is a syntax error (probably some compiler flag is missing; I'm not very familiar with VS compiler flags):

    c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\Include\xlocale(323) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
    simhash\simhash-cpp\src\permutation.cpp(37) : warning C4267: 'initializing' : conversion from 'size_t' to 'int', possible loss of data
    simhash\simhash-cpp\src\permutation.cpp(101) : error C2143: syntax error : missing ',' before ':'
    simhash\simhash-cpp\src\permutation.cpp(101) : error C2530: 'choice' : references must be initialized
    simhash\simhash-cpp\src\permutation.cpp(102) : error C2143: syntax error : missing ';' before '{'
    simhash\simhash-cpp\src\permutation.cpp(104) : error C2143: syntax error : missing ',' before ':'
    simhash\simhash-cpp\src\permutation.cpp(105) : error C2143: syntax error : missing ';' before '{'
    simhash\simhash-cpp\src\permutation.cpp(147) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)
    
  • Python 3.5: with Visual Studio 2015, the build passes, but there appears to be an infinite loop when running test_basic (test.TestFindAll). Since TestFindAll mostly calls the simhash-cpp code directly, I guess this is more of a "how to build simhash-cpp with VS 2015" issue.
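
The compiler-flag side of the points above could be handled in a setuptools-based build script along these lines. This is a minimal sketch, not the project's actual setup.py: `extension_kwargs` is a hypothetical helper, `/EHsc` is simply the flag that warning C4530 asks for, and `-std=c++11` is an assumption about what the sources need on other compilers. Note that `/EHsc` would probably not fix the C2143 errors under Visual Studio 2010; those look more like the compiler rejecting a C++11 range-based for loop than a missing flag.

```python
import sys


def extension_kwargs(platform=sys.platform):
    """Return compiler options for a C++ extension (hypothetical helper)."""
    if platform == "win32":
        # MSVC: /EHsc enables the C++ exception unwinding that warning C4530 asks for.
        return {"language": "c++", "extra_compile_args": ["/EHsc"]}
    # GCC/Clang: request C++11 explicitly (an assumption about what the sources need).
    return {"language": "c++", "extra_compile_args": ["-std=c++11"]}


if __name__ == "__main__":
    # In a setup.py these would be passed on as
    # setuptools.Extension(name, sources, **extension_kwargs()).
    print(extension_kwargs())
```

`sys.platform` is `"win32"` on both 32-bit and 64-bit Windows, so a single branch covers both.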

I'm mostly interested in making at least Python 3.5 work on Windows (and I don't have much experience building C++ code on Windows). @dlecocq, would you have any suggestions on how the Python 3.5 issue above could be debugged? Thanks!
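
One way to narrow down a hang like this is to run the suspect call in a separate interpreter with a timeout, so the CI job fails fast instead of stalling. A minimal sketch using only the standard library; `runs_within` is a hypothetical helper name, and the code string passed to it would be replaced by the failing test or find_all call.

```python
import subprocess
import sys


def runs_within(code, timeout_seconds):
    """Run `code` in a fresh interpreter; return False if it exceeds the timeout."""
    try:
        subprocess.run([sys.executable, "-c", code], timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing is left hanging.
        return False
    return True


if __name__ == "__main__":
    print(runs_within("print('ok')", 30))
    print(runs_within("while True: pass", 1))
```

Running the call in a child process also means a hang inside the C++ extension cannot block the parent, which a thread-based timeout would not guarantee.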

My find_all call keeps running forever. I am trying to run find_all with just three hashes, a block size of 6, and the number of differing bits set to 3.
I'm running it on Windows 10 with Python 3.7.4.
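
With only three hashes the expected result is easy to compute directly, which helps separate "find_all hangs" from "find_all is slow". The sketch below is a pure-Python brute-force comparison, not simhash-py's algorithm; `find_all_bruteforce` is a hypothetical name and the sample hashes are made up. It ignores the block parameter, which, as far as I understand, only affects how the library searches and not which pairs match.

```python
from itertools import combinations


def num_differing_bits(a, b):
    """Hamming distance between two integer hashes."""
    return bin(a ^ b).count("1")


def find_all_bruteforce(hashes, different_bits):
    """Return every pair of hashes that differs in at most `different_bits` bits."""
    return {
        (min(a, b), max(a, b))
        for a, b in combinations(set(hashes), 2)
        if num_differing_bits(a, b) <= different_bits
    }


if __name__ == "__main__":
    hashes = [0b0000, 0b0111, 0b11111111]  # made-up example values
    print(sorted(find_all_bruteforce(hashes, different_bits=3)))
```

If this returns immediately for the same three hashes while the library call never does, the problem is more likely in the compiled extension than in the input.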