OCI runtime create failed (OS: Ubuntu 20.04)
pwnedzen opened this issue · 8 comments
When running `docker-compose build && docker-compose up`,
I get an "OCI runtime create failed" error (posted below):
Sending build context to Docker daemon 1.724MB
Step 1/7 : FROM python:3
---> 6beb0d435def
Step 2/7 : ENV PYTHONUNBUFFERED=1
---> Using cache
---> 2aab34bae599
Step 3/7 : WORKDIR /clifs
---> Using cache
---> 7c6854350226
Step 4/7 : COPY requirements.txt /clifs
---> Using cache
---> df31f0e3f4ff
Step 5/7 : RUN pip install -r requirements.txt
---> Running in 8eca7fd2853d
OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:722: waiting for init preliminary setup caused: EOF: unknown
ERROR: Service 'web' failed to build : Build failed
**Any ideas? I have tried running the command both with sudo and as a regular user. From my digging, the error might pertain to the setup of the Dockerfile, but I'm not sure.**
I haven't seen this issue before. I would check that Docker has enough disk space available and that the Docker version is relatively new. You could also try setting DOCKER_BUILDKIT=1 just ahead of your build command.
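A minimal sketch of those checks, in case it helps. It assumes Docker's data lives under the default Linux location `/var/lib/docker` (an assumption, not something stated in this thread), and the helper names `free_gib`, `docker_server_version`, and `buildkit_env` are made up for illustration.

```python
import os
import shutil
import subprocess

# Assumption: Docker's default data root on Linux; adjust if yours differs.
DOCKER_DATA_ROOT = "/var/lib/docker"


def free_gib(path=DOCKER_DATA_ROOT):
    """Return free space in GiB on the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        # Path missing or unreadable: fall back to the root filesystem.
        usage = shutil.disk_usage("/")
    return usage.free / 2**30


def docker_server_version():
    """Return the Docker server version string, or None if unavailable."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "version", "--format", "{{.Server.Version}}"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def buildkit_env():
    """Return a copy of the environment with DOCKER_BUILDKIT=1 set."""
    env = dict(os.environ)
    env["DOCKER_BUILDKIT"] = "1"
    return env


if __name__ == "__main__":
    print(f"Free space: {free_gib():.1f} GiB")
    print(f"Docker server version: {docker_server_version()}")
```

To try the BuildKit suggestion from the same script, `buildkit_env()` can be passed as `env=` to `subprocess.run(["docker-compose", "build"], ...)`.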
I ran it with that added to the command, along with some other minor changes, and the new error message is this:
Attaching to clifs-web-1, search-engine
clifs-web-1 | standard_init_linux.go:228: exec user process caused: no such file or directory
Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown
**Any other advice would be appreciated, thanks. I'm still debugging on my end as well.**
Alright, so it seems to build. If you're trying to run the GPU-enabled docker-compose file, make sure you've got the nvidia-container-toolkit installed as well as an NVIDIA driver that works for CUDA 10.1.
You could check just your NVIDIA setup by running, e.g., the image used in this app (pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel) and testing that the GPU is available to it.
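One way to run that check, as a rough sketch: it assumes Docker 19.03 or newer (for the `--gpus` flag) and that `python` is on the image's PATH. The function names `gpu_check_command` and `gpu_visible_in_container` are hypothetical.

```python
import shutil
import subprocess

# Image named above as the one used by this app.
IMAGE = "pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel"


def gpu_check_command(image=IMAGE):
    """Build a `docker run` command that prints torch.cuda.is_available()."""
    return [
        "docker", "run", "--rm", "--gpus", "all", image,
        "python", "-c", "import torch; print(torch.cuda.is_available())",
    ]


def gpu_visible_in_container(image=IMAGE):
    """Return True/False as reported inside the container, or None if the check could not run."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            gpu_check_command(image), capture_output=True, text=True
        )
    except OSError:
        return None
    if result.returncode != 0:
        # E.g. the NVIDIA container hook failed before Python even started.
        return None
    return result.stdout.strip().endswith("True")
```

Calling `gpu_visible_in_container()` pulls the image if it is not cached, so the first run can take a while. A result of `None` means the container did not start cleanly, which points at the driver or toolkit rather than at PyTorch.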
It still looks like I am having trouble finding the correct NVIDIA driver. Do you have a link or more information on which one you used, and where you downloaded it from? This is the last error I am encountering:
Successfully tagged clifs_search-engine:latest
Starting clifs_web_1 ... done
Starting search-engine ... done
Attaching to search-engine, clifs_web_1
web_1 | standard_init_linux.go:228: exec user process caused: no such file or directory
clifs_web_1 exited with code 1
search-engine | * Serving Flask app 'server' (lazy loading)
search-engine | * Environment: production
search-engine | WARNING: This is a development server. Do not use it in a production deployment.
search-engine | Use a production WSGI server instead.
search-engine | * Debug mode: off
search-engine | /opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
search-engine | return torch._C._cuda_getDeviceCount() > 0
search-engine | 2021-10-04 13:51:48,526 - * Running on all addresses.
search-engine | WARNING: This is a development server. Do not use it in a production deployment.
search-engine | 2021-10-04 13:51:48,526 - * Running on http://172.19.0.3:5000/ (Press CTRL+C to quit)
^CGracefully stopping... (press Ctrl+C again to force)
**I already have an NVIDIA driver on my machine, but torch is still not finding it. Thanks.**
Make sure you are using the GPU docker-compose file (as stated in the instructions) and make sure you have the correct NVIDIA drivers and the nvidia-container-toolkit. For the latter two items, I think googling is the easiest solution, as it can take a few steps to set up.
I have checked and resolved these issues for now, and am now running into the same issue posted by hoel-bagard: OpenCV error when opening the video #6
search-engine |
search-engine | Traceback (most recent call last):
search-engine | File "server.py", line 8, in <module>
search-engine | clifs = CLIFS()
search-engine | File "/app/clifs.py", line 30, in __init__
search-engine | self.add_video(f)
search-engine | File "/app/clifs.py", line 71, in add_video
search-engine | feature_t = torch.cat(feature_list, dim=0)
search-engine | RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
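The RuntimeError itself says `torch.cat` received an empty list. A minimal sketch of a guard that would surface this earlier, assuming (not confirmed in this thread) that `feature_list` ends up empty because no frames were read from the video. The helper name `concat_features` is hypothetical and is not part of clifs.py.

```python
def concat_features(feature_list):
    """Concatenate per-frame feature tensors, failing early if there are none."""
    if not feature_list:
        raise ValueError(
            "No features were extracted; check that the video file could be opened."
        )
    # Imported here so the empty-list check does not itself require torch.
    import torch

    return torch.cat(feature_list, dim=0)
```

With a guard like this, an unreadable video would produce a short, explicit message instead of the dispatcher error above.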
Great. That's not an error I've seen before either, but I will look into it. Closing this issue as it is now covered by #6.
Just wanted to add: I started fresh with an Ubuntu 18.04 install on a VM (not WSL) and got the project working! Really amazing project, johanmodin.
Thanks for your work and responses!