Why is the installed Python library different depending on build method? Re: bioconda
cerebis opened this issue · 4 comments
I'll start this off by saying I want to get the Jellyfish Python bindings into the Jellyfish bioconda package. If I can't get this to work, I'll either have to move my project to another kmer counter or possibly reinvent the wheel and create an external project to make JF bindings (probably using Pybind). If I begin there, I will have to assess whether JF is the best place to start. I really can't afford the time though!
I find the simplest method of creating Python bindings (./configure --enable-python-binding) to be problematic. In short, I get errors on import using this approach, whether I attempt import jellyfish or import dna_jellyfish.
The direct build approach as detailed in issue #134 works fine.
The import error is below. I haven't dived into SWIG to understand its process of loading dynamic libs. Seems like this might be as simple as a name or a search path issue?
>>> import dna_jellyfish
Traceback (most recent call last):
  File "swig/python/dna_jellyfish.py", line 20, in swig_import_helper
  File "/home/cerebis/miniconda3/envs/jf/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named '_dna_jellyfish'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "swig/python/dna_jellyfish.py", line 23, in <module>
  File "swig/python/dna_jellyfish.py", line 22, in swig_import_helper
  File "/home/cerebis/miniconda3/envs/jf/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_dna_jellyfish'
My question though is, why do the two methods generate different installation structures? Is one out of date with respect to the other?
The ./configure --enable-python-binding method produces the following (import fails):
├── dna_jellyfish
│   ├── _dna_jellyfish.a
│   ├── _dna_jellyfish.so -> _dna_jellyfish.so.0.0.0
│   ├── _dna_jellyfish.so.0 -> _dna_jellyfish.so.0.0.0
│   ├── _dna_jellyfish.so.0.0.0
│   └── __init__.pyc
├── jellyfish.py
└── __pycache__
    └── jellyfish.cpython-38.pyc
While the direct build in the SWIG folder produces (import succeeds):
├── dna_jellyfish-0.0.1-py2.7.egg-info
├── dna_jellyfish.py
├── dna_jellyfish.pyc
└── _dna_jellyfish.so
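One way to test the "name or search path" guess is to ask Python which files it could load as the compiled module. The sketch below is only a diagnostic, not part of Jellyfish: the helper name find_extension_candidates is made up for illustration, and it only checks for files named after the module with one of the suffixes the interpreter accepts for extension modules.

```python
import importlib.machinery
import os
import sys


def find_extension_candidates(module_name, search_dirs=None):
    """List files that could be loaded as the extension `module_name`.

    Hypothetical helper for illustration. It looks in each directory
    (default: sys.path) for `module_name` plus any suffix this
    interpreter accepts for compiled extension modules.
    """
    if search_dirs is None:
        search_dirs = sys.path
    hits = []
    for directory in search_dirs:
        directory = directory or os.getcwd()  # '' on sys.path means cwd
        for suffix in importlib.machinery.EXTENSION_SUFFIXES:
            candidate = os.path.join(directory, module_name + suffix)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    found = find_extension_candidates("_dna_jellyfish")
    if found:
        for path in found:
            print("loadable candidate:", path)
    else:
        print("no _dna_jellyfish extension is directly on sys.path")
```

In the failing layout above, _dna_jellyfish.so sits inside the dna_jellyfish directory, so a top-level lookup of _dna_jellyfish would only find it if that directory itself is on sys.path. An empty result from this check would be consistent with that, though it does not prove it is the only problem.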
I do not know anything about Jellyfish, but looking at the outputs above, one looks like Python 3.8 (jellyfish.cpython-38.pyc), while the other is Python 2.7 (dna_jellyfish-0.0.1-py2.7.egg-info). So if you work from within Python 2 (which works), the Python 3 import will fail.
The --enable-python-binding option will possibly use whichever Python version configure finds, while the direct method uses 'python setup.py', which would automatically call Python 2. (I had to figure out the reverse: installing into Python 3 when --enable-python-binding picked the Python 2.7 tree on my box.)
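To check the version-mismatch idea, it may help to compare the interpreter you import from with the tags on the installed files. This is a minimal sketch; interpreter_report and bytecode_tag are hypothetical helper names, and the tag parsing assumes the usual name.TAG.pyc naming used in __pycache__ directories.

```python
import os
import sys
import sysconfig


def interpreter_report():
    """Describe the interpreter that is running this script."""
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "cache_tag": sys.implementation.cache_tag,
        # Directory where this interpreter installs compiled packages.
        "platlib": sysconfig.get_paths()["platlib"],
    }


def bytecode_tag(filename):
    """Return the interpreter tag embedded in a .pyc name, or None.

    For example, "jellyfish.cpython-38.pyc" carries the tag
    "cpython-38"; an untagged "dna_jellyfish.pyc" yields None.
    """
    parts = os.path.basename(filename).split(".")
    if len(parts) >= 3 and parts[-1] == "pyc":
        return parts[-2]
    return None


if __name__ == "__main__":
    report = interpreter_report()
    for key, value in report.items():
        print(key, "=", value)
    tag = bytecode_tag("jellyfish.cpython-38.pyc")
    print("file tag:", tag, "| matches interpreter:", tag == report["cache_tag"])
```

Run it with the same python that fails on import. If the reported version or platlib differs from the tree the bindings were installed into, the mismatch described above is a plausible cause.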
I have the same problem installing the Python bindings with the --enable-python-binding option. I have checked and can confirm that Jellyfish is installing the Python bindings with the correct Python version.
Similar to @cerebis, the only way I can get the dna_jellyfish python module to load correctly is by going to swig/python and installing manually.
The last jellyfish version which compiles and installs python bindings correctly is v2.2.6.
Dear @cerebis
Here are my steps for building the Python 3.x binding for Jellyfish 2.3.0 (the latest release as of today):
Create a Python 3.x environment
$ curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ sh ./Miniconda3-latest-Linux-x86_64.sh -bufp /ibex/scratch/projects/c2072/work/jellyfish/python3.10
$ export PATH=/ibex/scratch/projects/c2072/work/jellyfish/python3.10/bin:$PATH
$ python --version
Python 3.10.9
Download the Jellyfish source code
$ wget https://github.com/gmarcais/Jellyfish/releases/download/v2.3.0/jellyfish-2.3.0.tar.gz
Prerequisites for source code compilation
autoconf, automake, libtool, gettext, pkg-config and yaggo.
Many systems will not have Yaggo installed; if so, just download it as a binary from here:
$ wget https://github.com/gmarcais/yaggo/releases/download/v1.5.10/yaggo
$ chmod +x yaggo
$ export PATH=/ibex/scratch/projects/c2072/work/jellyfish:$PATH
Compile Jellyfish
$ tar -xzvf jellyfish-2.3.0.tar.gz
$ cd Jellyfish/
$ autoreconf -i
$ ./configure --prefix=/ibex/scratch/projects/c2072/work/jellyfish/install --enable-python-binding
$ make
$ make install
For Python binding, do the following:
$ cd swig/python/
# Set the location of jellyfish-2.0.pc file, where it is available.
$ export PKG_CONFIG_PATH=/ibex/scratch/projects/c2072/work/jellyfish/install/lib/pkgconfig
$ python ./setup.py build
$ python ./setup.py install --prefix=/ibex/scratch/projects/c2072/work/jellyfish/install
Set PYTHONPATH for the Jellyfish Python binding
$ export PYTHONPATH=/ibex/scratch/projects/c2072/work/jellyfish/install/lib/python3.10/site-packages:/ibex/scratch/projects/c2072/work/jellyfish/install/lib/python3.10/site-packages/dna_jellyfish:$PYTHONPATH
Test the Jellyfish Python binding
$ python create_matrix.py --help
usage: create_matrix.py [-h] -a ACCNAME -j JFDUMP -c CONFIG [-k KMERSIZE] [-mc MINCOUNT] [-o OUTPUT]
Parse jellyfish dump file to check presence/absence of k-mers in diversity panel.
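Before running a larger script, the PYTHONPATH setup can be sanity-checked on its own. The sketch below is an assumption-laden helper rather than anything shipped with Jellyfish: pythonpath_problems and locate_module are hypothetical names, and the check only flags entries with stray whitespace or entries that are not existing directories.

```python
import importlib.util
import os


def pythonpath_problems(value):
    """Return (entry, reason) pairs for suspicious PYTHONPATH entries."""
    problems = []
    for entry in value.split(os.pathsep):
        if not entry:
            continue  # empty entries are ignored in this sketch
        if entry != entry.strip():
            problems.append((entry, "surrounding whitespace"))
        elif not os.path.isdir(entry):
            problems.append((entry, "not a directory"))
    return problems


def locate_module(name):
    """Return where top-level module `name` would load from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    for entry, reason in pythonpath_problems(os.environ.get("PYTHONPATH", "")):
        print("PYTHONPATH entry %r: %s" % (entry, reason))
    # Both the SWIG wrapper and its compiled companion must be findable.
    for name in ("dna_jellyfish", "_dna_jellyfish"):
        print(name, "->", locate_module(name))
```

If both module names resolve to paths under the install prefix, the binding should import. A None for _dna_jellyfish would correspond to the ModuleNotFoundError reported at the top of this issue.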