Failed to create a Pyinstaller app that includes dvc
jorgegarcia-ea opened this issue · 8 comments
Bug Report
Description
We are trying to make a pyinstaller app that includes dvc and after generating the spec we get this error:
PyInstaller.exceptions.ImportErrorWhenRunningHook: Failed to import module __PyInstaller_hooks_0_dvc required by hook for module D:\test_venv\Lib\site-packages\dvc\__pyinstaller\hook-dvc.py. Please check whether module __PyInstaller_hooks_0_dvc actually exists and whether the hook is compatible with your version of D:\test_venv\Lib\site-packages\dvc\__pyinstaller\hook-dvc.py: You might want to read more about hooks in the manual and provide a pull-request to improve PyInstaller.
Reproduce
- Create a python venv with pyinstaller and dvc in the requirements
- Add some source files that use dvc
- Generate a pyinstaller spec
- Run pyinstaller on the spec
Expected
Can create a pyinstaller app without errors
Environment information
Windows 10, python 3.11.6, pyinstaller 6.4.0, dvc 3.47.0
Hey @jorgegarcia-ea, we are a small team and won't be able to allocate time to researching such a specific setup as a custom PyInstaller build (unless someone already has good knowledge of what is happening there). Depending on the context, you could reach out and engage with us at https://dvc.org/support, or try to dig into this more and keep sharing the results here. The first step should probably be an easily reproducible bundle that anyone could run, plus more details on what you think is happening.
@jorgegarcia-ea Any particular reason you are trying to build it yourself? We provide a windows package that has a pyinstaller-built binary inside, e.g. https://dvc.org/download/win/dvc-3.47.0 . Here's the whole process of building that https://github.com/iterative/dvc-exe
Hi @shcheklein and @efiop, many thanks for your quick responses. Here's additional details of the issue we are experiencing from my colleague, hope they are helpful:
We want to create an executable of our application. The goal is to reduce the size by avoiding packaging some data and instead using DVC. The code will download the data stored in the remote storage the first time the app starts.
We have been able to do this locally using DVCFileSystem and get_file(). However, when we try to build the exe with pyinstaller we get the error above.
Please find a minimum reproduction example attached. Let me know if you have any questions or need additional details.
Oh, so you are building a standalone app where dvc is a dependency, I see now. Thanks for clarifying.
So which commands are you running? You've shared part of the error and the source files, but not the full error log or the particular pyinstaller commands you ran to generate the spec and build the app.
Also, looking at main.py, I see that you make some very strong assumptions about dvc file availability by using cwd with dvcfs, and I don't think that will work, because you need to tell pyinstaller which files you want to include in the resulting app. So you'll probably run into more issues, but they are not really dvc-specific; they are about using pyinstaller itself, and we won't be able to help you there.
Hello, thanks for your help, I'm collaborating with @jorgegarcia-ea on this project.
We would like to know if it is even possible to run dvc commands inside a standalone app created with pyinstaller, since the error we get happens when building the executable.
We are planning on adding the necessary files to the bundle. For now, let's assume we will receive an error at runtime due to lack of files.
In the minimal example, just running pyinstaller main.py will return the above error. The whole stack trace is attached (notice I'm using conda instead of venv):
dvc_pyinstaller_test_stacktrace.txt
Let us know if you need more information to reproduce the error.
Thank you!
@MonicaVillanueva Thanks for the log! The error there is actually:
importlib.metadata.PackageNotFoundError: No package metadata was found for adlfs
which probably means that you don't have adlfs installed. I suppose you've installed dvc through pip, so you probably need to pip install 'dvc[all]' to install all optional dependencies. That's just the way we build pyinstaller apps ourselves, so we expect all extra dependencies to be installed. There are ways to avoid that, but that will require modifying the hook.
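For anyone hitting the same hook failure, a quick way to see which distributions lack metadata before running pyinstaller is sketched below. It only uses the standard library; the helper name missing_distributions is hypothetical, and adlfs is the only package name taken from the log above, so extend the list with whatever the hook complains about in your environment.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the names for which no installed package metadata is found."""
    missing = []
    for name in names:
        try:
            # Raises PackageNotFoundError, the same exception seen in the log.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "adlfs" is the package named in the error above.
    print(missing_distributions(["adlfs"]))
```

An empty list means the metadata lookup for those names should succeed; any name that is printed still needs to be installed (for example through pip install 'dvc[all]').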
Hey @efiop, sorry for the delay
Thanks for the suggestion, it works!
I'm working now towards making the example above work (e.g. use dvc inside the executable).
I will update here when I get it working. It might be useful for other people and you could add it to the documentation if you find it valuable.
Thanks for your help
For future reference:
There are a couple of tricky things:
- As kindly suggested above, you need to install all dvc dependencies with pip install dvc[all]
- It is necessary to move the .dvc folder inside the executable's bundled dir, _internal. If you don't have no_scm = true, you will need to add it to .dvc/config. Otherwise, when you run the exe it will complain that it is not a git repository
- I realized, using fs.find, that the dvc file needs a slash in front of the path or fs.get_file won't work
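As a sketch of the second point, the relevant part of .dvc/config would look roughly like this. It assumes the option belongs in the core section, which is where dvc init --no-scm writes it; any remote definitions already in the file stay as they are.

```ini
[core]
    no_scm = true
```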
The rest is standard pyinstaller:
- You need to add your dvc file using pyinstaller main.py --add-data "test_file.txt.dvc;."
- You need to set the DVCFileSystem using the absolute path to the bundle folder stored in sys._MEIPASS, e.g. DVCFileSystem(sys._MEIPASS)
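Putting the runtime side together, a minimal sketch might look like the following. The helper names bundle_dir, dvc_path and fetch are hypothetical, the fallback to the current working directory when not running from a bundle is an assumption, and test_file.txt is assumed to be the file tracked by test_file.txt.dvc. The dvc import is done lazily so the two path helpers can be exercised without dvc installed.

```python
import sys
from pathlib import Path


def bundle_dir():
    """Return sys._MEIPASS inside a PyInstaller app, else the current directory."""
    meipass = getattr(sys, "_MEIPASS", None)
    return Path(meipass) if meipass else Path.cwd()


def dvc_path(name):
    """Give the path exactly one leading slash, as get_file needed in this thread."""
    return "/" + name.lstrip("/")


def fetch(name, destination):
    """Download a DVC-tracked file from the bundled repo to a local destination."""
    # Imported here so the helpers above work even where dvc is not installed.
    from dvc.api import DVCFileSystem

    fs = DVCFileSystem(str(bundle_dir()))
    fs.get_file(dvc_path(name), destination)


if __name__ == "__main__":
    # Assumed file name, matching the .dvc file added with --add-data above.
    fetch("test_file.txt", "test_file.txt")
```

The fetch call only succeeds if the .dvc folder and the .dvc file are actually present in the bundle directory, as described in the points above.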
Find the updated project here: dvc_pyinstaller_test_updated.zip