ERROR: Could not build wheels for pandas, scikit-learn, which is required to install pyproject.toml-based projects
davidgilbertson opened this issue · 2 comments
🐛 Bug
To Reproduce
pip install lightning-flash[tabular]
fails with a lot of errors. There are thousands of lines of output, so I'm not sure which parts to share.
Here is a sample:
building 'pandas._libs.algos' extension
creating build/temp.linux-x86_64-cpython-310
creating build/temp.linux-x86_64-cpython-310/pandas
creating build/temp.linux-x86_64-cpython-310/pandas/_libs
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -DNPY_NO_DEPRECATED_API=0 -I./pandas/_libs -Ipandas/_libs/src/klib -I/tmp/pip-build-env-0xcqjj9j/overlay/lib/python3.10/site-packages/numpy/core/include -I/home/davidg/.virtualenvs/learning/include -I/usr/include/python3.10 -c pandas/_libs/algos.c -o build/temp.linux-x86_64-cpython-310/pandas/_libs/algos.o
pandas/_libs/algos.c:42:10: fatal error: Python.h: No such file or directory
42 | #include "Python.h"
| ^~~~~~~~~~
compilation terminated.
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pandas
Building wheel for scikit-learn (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for scikit-learn (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [1003 lines of output]
Partial import of sklearn during the build process.
<string>:116: DeprecationWarning:
`numpy.distutils` is deprecated since NumPy 1.23.0, as a result
of the deprecation of `distutils` itself. It will be removed for
Python >= 3.12. For older Python versions it will remain present.
It is recommended to use `setuptools < 60.0` for those Python versions.
For more details, see:
https://numpy.org/devdocs/reference/distutils_status_migration.html
INFO: C compiler: x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC
And, an interesting part...?
In file included from sklearn/svm/src/libsvm/libsvm_template.cpp:6:
sklearn/svm/src/libsvm/svm.cpp: In function ‘const char* svm_check_parameter(const svm_problem*, const svm_parameter*)’:
sklearn/svm/src/libsvm/svm.cpp:3130:11: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
3130 | msg = "Invalid input - all samples have zero or negative weights.";
And the final lines of the error:
INFO:
########### CLIB COMPILER OPTIMIZATION ###########
INFO: Platform :
Architecture: x64
Compiler : gcc
CPU baseline :
Requested : 'min'
Enabled : SSE SSE2 SSE3
Flags : -msse -msse2 -msse3
Extra checks: none
CPU dispatch :
Requested : 'max -xop -fma4'
Enabled : SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_KNL AVX512_KNM AVX512_SKX AVX512_CLX AVX512_CNL AVX512_ICL
Generated : none
INFO: CCompilerOpt.cache_flush[857] : write cache to path -> /tmp/pip-install-2tylos6b/scikit-learn_9f92637d15f141dfa069c8954878de3b/build/temp.linux-x86_64-cpython-310/ccompiler_opt_cache_clib.py
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for scikit-learn
Failed to build pandas scikit-learn
ERROR: Could not build wheels for pandas, scikit-learn, which is required to install pyproject.toml-based projects
I already have Pandas 1.4.4 and scikit-learn 1.1.2 installed.
I notice it's trying to install quite an old version of scikit-learn (0.24), can that be avoided?
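For anyone who wants to see what the tabular extra asks for, here is a minimal sketch that reads the requirement strings an installed distribution declares for a given extra. The helper names are my own, and it only shows direct requirements: a pin such as the old scikit-learn range may come from a dependency of a dependency, so treat the output as a starting point rather than the full picture.

```python
from importlib import metadata


def requirements_for_extra(requirement_lines, extra):
    """Return the requirement strings that only apply when `extra` is requested."""
    selected = []
    for line in requirement_lines or []:
        # Metadata lines look like: "name<specifier>; extra == 'tabular'".
        requirement, _, marker = line.partition(";")
        normalized = marker.replace('"', "'").replace(" ", "")
        if f"extra=='{extra}'" in normalized:
            selected.append(requirement.strip())
    return selected


def installed_extra_requirements(distribution, extra):
    """Look up an installed distribution; return [] if it is not installed."""
    try:
        return requirements_for_extra(metadata.requires(distribution), extra)
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    for requirement in installed_extra_requirements("lightning-flash", "tabular"):
        print(requirement)
```

Running it in the environment where lightning-flash is installed prints one requirement per line; it prints nothing if the package is missing or declares no such extra.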
Environment
- OS (e.g., Linux): Windows 11 host, running in WSL2 (Ubuntu).
- Python version: 3.10.7
- PyTorch/Lightning/Flash Version (e.g., 1.10/1.5/0.7):
  - torch: 1.12.1+cu116
  - pytorch-lightning: 1.7.7
  - lightning-flash: 0.8.0
- GPU models and configuration: RTX3090
- Any other relevant information: From Stack Overflow, I see advice about needing to run sudo apt-get install python3-dev to avoid the error about Python.h. Surely I don't need to do that, though, just to get an old version of a package I already have. And if I do, should that be in the docs?
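As far as I can tell from the log, the header only matters because pip ends up compiling pandas and scikit-learn from source. A quick way to confirm whether the interpreter's C headers are present is sketched below. The function names are hypothetical, and finding Python.h only shows that the header exists for this interpreter, not that the build will succeed.

```python
import sysconfig
from pathlib import Path


def candidate_include_dirs():
    """Directories in which this interpreter's C headers may be installed."""
    paths = sysconfig.get_paths()
    candidates = [
        sysconfig.get_config_var("INCLUDEPY"),
        paths.get("include"),
        paths.get("platinclude"),
    ]
    # Drop missing entries and duplicates while keeping the original order.
    return list(dict.fromkeys(Path(c) for c in candidates if c))


def find_python_header(include_dirs=None):
    """Return the first existing Python.h, or None if no candidate has one."""
    dirs = candidate_include_dirs() if include_dirs is None else include_dirs
    for directory in dirs:
        header = Path(directory) / "Python.h"
        if header.is_file():
            return header
    return None


if __name__ == "__main__":
    header = find_python_header()
    if header is None:
        print("Python.h not found in:", *candidate_include_dirs(), sep="\n  ")
    else:
        print("Found", header)
```

If nothing is found on Ubuntu, that matches the gcc error above, and installing the development headers for the matching Python version is the usual remedy.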
Additional context
pip install lightning-flash (without the [tabular] extra) works fine.
The above errors are all in WSL/Ubuntu. I have the same packages installed on my Windows machine (although they'll vary in minor versions), and there I get a different error. First, there was an error about not having the C++ build tools installed, so I installed those. Then, running pip install lightning-flash[tabular] gives me another wall of errors. The last part is:
building 'pandas._libs.parsers' extension
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DNPY_NO_DEPRECATED_API=0 -I.\pandas\_libs -Ipandas/_libs/src/klib -Ipandas/_libs/src -IC:\Users\david\AppData\Local\Temp\pip-build-env-mp9k3z9t\overlay\Lib\site-packages\numpy\core\include "-IC:\Program Files\Python310\include" "-IC:\Program Files\Python310\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.33.31629\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" /Tcpandas/_libs/src/parser/io.c /Fobuild\temp.win-amd64-cpython-310\Release\pandas/_libs/src/parser/io.obj
io.c
pandas/_libs/src/parser/io.c(139): error C2065: 'ssize_t': undeclared identifier
pandas/_libs/src/parser/io.c(139): error C2146: syntax error: missing ';' before identifier 'rv'
pandas/_libs/src/parser/io.c(139): error C2065: 'rv': undeclared identifier
pandas/_libs/src/parser/io.c(145): error C2065: 'rv': undeclared identifier
pandas/_libs/src/parser/io.c(145): warning C4267: 'function': conversion from 'size_t' to 'unsigned int', possible loss of data
pandas/_libs/src/parser/io.c(146): error C2065: 'rv': undeclared identifier
pandas/_libs/src/parser/io.c(157): error C2065: 'rv': undeclared identifier
pandas/_libs/src/parser/io.c(158): error C2065: 'rv': undeclared identifier
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.33.31629\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pandas
Building wheel for antlr4-python3-runtime (setup.py) ... done
Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.8-py3-none-any.whl size=141210 sha256=a2f7093bde5cb7466fd2d761d5726bd87d75f82bd0bb277b2f9ee7d8d2357232
Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\a7\20\bd\e1477d664f22d99989fd28ee1a43d6633dddb5cb9e801350d5
Building wheel for scikit-learn (pyproject.toml) ... done
Created wheel for scikit-learn: filename=scikit_learn-0.24.2-cp310-cp310-win_amd64.whl size=6269569 sha256=b0d4460d7c5b65d8daf43e24fb43b2bb641eeed2c55a66d61297c2469797aaf5
Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\13\a4\68\4e78865652fa14db4a162b491e5138565f97646f9e1f2ab8cc
Building wheel for PyYAML (pyproject.toml) ... done
Created wheel for PyYAML: filename=PyYAML-5.4.1-cp310-cp310-win_amd64.whl size=45655 sha256=e80ae26593bbca066131a7044857c731fa526bed252d2a87ea8e96aee806bd97
Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\c7\0d\22\696ee92245ad710f506eee79bb05c740d8abccd3ecdb778683
Building wheel for pyperclip (setup.py) ... done
Created wheel for pyperclip: filename=pyperclip-1.8.2-py3-none-any.whl size=11123 sha256=202846dfde61c94edf0b1755f7e41a5f5aafa99d701896228eadcee6dff07581
Stored in directory: c:\users\david\appdata\local\pip\cache\wheels\04\24\fe\140a94a7f1036003ede94579e6b4227fe96c840c6f4dcbe307
Successfully built pytorch-tabular antlr4-python3-runtime scikit-learn PyYAML pyperclip
Failed to build pandas
ERROR: Could not build wheels for pandas, which is required to install pyproject.toml-based projects
And to reiterate: I already have Pandas installed; this is blowing up while trying to install a very old version of Pandas.
This is probably not related, but I notice that it downloads an older version of torch.
I already have things working fine with torch and pytorch-lightning; I'm just interested in trying out the "easy" lightning-flash. But so far I can't get it installed on Windows OR Ubuntu. And I certainly don't want to go uninstalling packages that I have working so that they can be replaced with the old versions requested by flash.
Any ideas?
Hey, @davidgilbertson - Sorry that you are facing this issue. Since it says Python.h not found, it looks like the required header file is missing. If you are using Ubuntu, I assume you are using the apt package manager, so can you please try: sudo apt install libpython3.x-dev (where x is your Python 3.x minor version; for me it was 3.10, so I had to run: sudo apt install libpython3.10-dev).
Please let me know whether this solves your issue.
Thanks @krshrimali. I tried as you suggested and got more errors. I tried with --fix-missing and got the same errors. Then I ran sudo apt update, and that resolved the errors.
Anyway, now I get new errors. It seems like this uninstalled my pytorch-lightning@1.7.7 and installed pytorch-lightning@1.3.6, and then I get an error:
lightning-bolts 0.5.0 requires pytorch-lightning>=1.4.0, but you have pytorch-lightning 1.3.6 which is incompatible.
And I also get the error below, which is not surprising since lightning-flash brings in such an old scikit-learn.
yellowbrick 1.5 requires scikit-learn>=1.0.0, but you have scikit-learn 0.24.2 which is incompatible.
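To make the conflicts concrete, here is a rough sketch of how a version is checked against a specifier such as >=0.23,<0.25. It is a simplification I wrote for illustration: it only understands plain numeric releases (so not something like 1.12.1+cu116) and the basic comparison operators, whereas the packaging library implements the full rules that pip follows.

```python
import operator

# Two-character operators come first so that '>=' is not mistaken for '>'.
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_release(text):
    """Turn a plain release string such as '0.24.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, specifier):
    """Check a version against a comma-separated specifier; every clause must hold."""
    installed = parse_release(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        symbol = next(s for s in _OPERATORS if clause.startswith(s))
        wanted = parse_release(clause[len(symbol):])
        # Pad with zeros so that 1.0 and 1.0.0 compare as equal.
        width = max(len(installed), len(wanted))
        left = installed + (0,) * (width - len(installed))
        right = wanted + (0,) * (width - len(wanted))
        if not _OPERATORS[symbol](left, right):
            return False
    return True


if __name__ == "__main__":
    print(satisfies("1.1.2", ">=0.23,<0.25"))  # my scikit-learn vs. the old pin
    print(satisfies("0.24.2", ">=1.0.0"))      # downgraded scikit-learn vs. yellowbrick
    print(satisfies("1.3.6", ">=1.4.0"))       # downgraded pytorch-lightning vs. lightning-bolts
```

All three checks come out False, which is why no single scikit-learn or pytorch-lightning version can satisfy every package in the environment at once.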
If I may offer a suggestion: if this is to be a package aimed at being 'easy' to use, it really needs to work without having to install libpython-dev, and it really needs to avoid installing very old dependencies.
Specifically, you could:
- use the current version of pytorch-forecasting, which doesn't require scikit-learn>=0.23,<0.25
- pytorch-tabular has a hard requirement of pandas==1.1.5. They've relaxed this in their source code but haven't released a new version; you could push them to do so and use that version. Or, since that package seems pretty inactive, consider whether you need it at all.
Then installs would probably be nice and smooth.