apache/datasketches-cpp

datasketches does not work with python 3.9

MaheshGPai opened this issue · 12 comments

Datasketches module 3.2.0.1 seems to have been compiled with python 3.10 and does not work with python 3.9

root@907f428e7dc2:/temp# pip install datasketches --target .
Collecting datasketches
Downloading datasketches-3.2.0.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (474 kB)
|████████████████████████████████| 474 kB 658 kB/s
Collecting numpy
Downloading numpy-1.22.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
|████████████████████████████████| 16.8 MB 7.3 MB/s
Installing collected packages: numpy, datasketches
Successfully installed datasketches-3.2.0.1 numpy-1.22.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 21.2.4; however, version 22.0.3 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

root@907f428e7dc2:/temp# ls -lrt
total 1056
drwxr-xr-x 2 root root 4096 Feb 12 06:16 numpy.libs
drwxr-xr-x 19 root root 4096 Feb 12 06:16 numpy
drwxr-xr-x 2 root root 4096 Feb 12 06:16 bin
drwxr-xr-x 2 root root 4096 Feb 12 06:16 numpy-1.22.2.dist-info
-rwxr-xr-x 1 root root 1049008 Feb 12 06:16 datasketches.cpython-310-x86_64-linux-gnu.so
drwxr-xr-x 3 root root 4096 Feb 12 06:16 tests
drwxr-xr-x 3 root root 4096 Feb 12 06:16 src
drwxr-xr-x 2 root root 4096 Feb 12 06:16 datasketches-3.2.0.1.dist-info

root@907f428e7dc2:/temp# python
Python 3.9.10 (main, Jan 26 2022, 20:56:53)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import datasketches
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'datasketches'

According to the link below, Python 3.9 is supported:
https://www.piwheels.org/project/datasketches/

But the compiled library in the wheel is built for 3.10: datasketches.cpython-310-x86_64-linux-gnu.so

root@907f428e7dc2:/temp# ln -s datasketches.cpython-310-x86_64-linux-gnu.so datasketches.so

root@907f428e7dc2:/temp# python
Python 3.9.10 (main, Jan 26 2022, 20:56:53)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import datasketches
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: Python version mismatch: module was compiled for Python 3.10, but the interpreter version is incompatible: 3.9.10 (main, Jan 26 2022, 20:56:53)
[GCC 10.2.1 20210110].
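
The two different errors above can be reproduced without installing anything by comparing the file name against the extension suffixes an interpreter accepts. This is a minimal sketch, not part of datasketches: `extension_matches` is a hypothetical helper, and the suffix list is an assumed one for CPython 3.9 on x86_64 Linux (check `importlib.machinery.EXTENSION_SUFFIXES` on the target interpreter for the real list).

```python
import importlib.machinery


def extension_matches(filename, module_name, suffixes=None):
    """Return True if `filename` is a name the import system would try
    when looking for the extension module `module_name`."""
    if suffixes is None:
        # Suffixes accepted by the interpreter running this script.
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return any(filename == module_name + suffix for suffix in suffixes)


# ASSUMPTION: typical suffix list for CPython 3.9 on x86_64 Linux.
CP39_LINUX_SUFFIXES = [".cpython-39-x86_64-linux-gnu.so", ".abi3.so", ".so"]

if __name__ == "__main__":
    for name in (
        "datasketches.cpython-310-x86_64-linux-gnu.so",  # file shipped in the wheel
        "datasketches.so",                               # the symlink created above
    ):
        print(name, extension_matches(name, "datasketches", CP39_LINUX_SUFFIXES))
```

Under that assumption, the cp310-tagged name is never tried by a 3.9 interpreter, which matches the ModuleNotFoundError. The bare `.so` symlink is found, but the module's own version check then rejects it, which matches the ImportError.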

Thanks for reporting this. I'm not currently sure what's going on, beyond confirming that the wheel seems weird.

I'll note that it works on my Mac running 3.9, so the title of the issue is overly broad. I'll try to investigate but I'm not sure how fast the turnaround will be.

Thanks @jmalkin. It works on my Mac too. I hit this issue when I deployed my application using AWS Lambda, which currently supports only Python 3.9. The issue also shows up when I use Python's 3.9 Linux container. I have not tried this on Windows.

@MaheshGPai Which specific python 3.9 container were you using for this? They seem to have lots of variants.

I haven't done much with python containers, so if you can share a minimal Dockerfile that'd be appreciated. I was clearly running into path issues when experimenting yesterday.

I got this working by building datasketches using python 3.9.10.

Hello,

I ran into this issue on another version. After looking into it, it looks like the current package on PyPI (3.2.0.1) is incorrect: the .so file in every wheel is built for Python 3.10 (datasketches.cpython-310-x86_64-linux-gnu.so), so it doesn't load on any version other than 3.10.
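
For anyone who wants to check a downloaded wheel for this kind of mismatch, here is a rough sketch that compares the Python tag in the wheel file name with the `cpython-NNN` tag of the extension modules inside it. The helper names are hypothetical, and it assumes a single (uncompressed) Python tag such as `cp39` in the wheel name.

```python
import os
import re
import zipfile

# Matches the interpreter tag in names like "mod.cpython-310-x86_64-linux-gnu.so".
_SO_TAG = re.compile(r"\.cpython-(\d+)[a-z]*-")


def wheel_python_tag(wheel_filename):
    """Extract the Python tag (e.g. 'cp39') from a wheel file name.

    Wheel names end in -{python tag}-{abi tag}-{platform tag}.whl,
    so the Python tag is the third field from the end.
    """
    stem = os.path.basename(wheel_filename)
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.split("-")[-3]


def mismatched_extensions(wheel_filename, member_names):
    """Return members whose cpython-NNN tag disagrees with the wheel's tag."""
    expected = wheel_python_tag(wheel_filename)
    bad = []
    for name in member_names:
        match = _SO_TAG.search(name)
        if match and "cp" + match.group(1) != expected:
            bad.append(name)
    return bad


def check_wheel(path):
    """Open a wheel on disk and list its mismatched extension modules."""
    with zipfile.ZipFile(path) as wheel:
        return mismatched_extensions(path, wheel.namelist())
```

Calling `check_wheel` on the cp39 wheel from the original report should list the cp310 `.so` if the wheel is affected, and return an empty list for a correctly built wheel.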

Sorry, I haven't had a chance to get to this. Still quite backlogged.

The underlying issue is that cibuildwheel is only building against 3.10, even when the wheel naming suggests otherwise. But I have no clue why this is happening. Presumably there's something incorrectly configured in the GitHub Action that generates the set of wheels.

Trying to cycle back to this finally. My best guess right now is that cmake is automatically finding a different version of python from what cibuildwheel expects. But that's pure speculation -- the only thing I can really confirm is that a newer version of cibuildwheel didn't fix it.
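
If the guess about CMake picking up a different interpreter is right, one common way to rule it out is to pass the running interpreter to CMake explicitly from the Python-side build script. The sketch below only builds the argument list. `cmake_python_hints` is a hypothetical helper, and whether this project's build reads either variable is an assumption: `Python3_EXECUTABLE` is the hint used by CMake's FindPython3 module, and `PYTHON_EXECUTABLE` is the one used by the older FindPythonInterp module.

```python
import sys


def cmake_python_hints(executable=None):
    """Return CMake -D arguments that pin the Python interpreter.

    Defaults to the interpreter running this script, which under
    cibuildwheel would be the interpreter the wheel is being built for.
    """
    exe = executable or sys.executable
    return [
        f"-DPython3_EXECUTABLE={exe}",  # hint for FindPython3
        f"-DPYTHON_EXECUTABLE={exe}",   # hint for legacy FindPythonInterp
    ]


if __name__ == "__main__":
    # Show the arguments that would be appended to the cmake command line.
    print(" ".join(cmake_python_hints()))
```

This is not necessarily the change that ended up fixing the wheels; it only illustrates how the interpreter choice can be made explicit instead of left to CMake's search.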

Ok, way too slow addressing this, but I think the artifact.zip at https://github.com/apache/datasketches-cpp/actions/runs/2261547153 (the link is at the bottom) should work. If anyone is still following this and able to try it, please let me know.

This should be resolved if you build yourself. For a pre-built artifact, we need to wait for the next release.

3.4.0 has been released on PyPI, which should finally close this out.