DeepStack GPU Docker image timeout
Opened this issue · 18 comments
System:
Debian Buster 10
Backports buster linux drivers
GTX 960 GPU
Hey there, I'm using the latest image from Docker Hub, deepquestai/deepstack:gpu.
After following the guide here, I managed to launch the deepstack:gpu container, but every time I send an image for detection I get a timeout error.
{'success': False, 'error': 'failed to process request before timeout', 'duration': 0}
Steps I took:
- Installed latest docker, nvidia-docker2 and deepstack:gpu docker image
- Started container with
sudo docker run --gpus all -e VISION-DETECTION=True -v localstorage:/datastore -p 5000:5000 deepquestai/deepstack:gpu
- Tried to send an image from here to the running container on localhost, using the same Python code (roughly the snippet shown below)
- Got a timeout error after 1 min
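For reference, the detection call is essentially the client snippet from the DeepStack docs; a minimal sketch (the image path is illustrative, not from the original report):

# Minimal sketch of the detection request, following the public DeepStack API.
import requests

with open("test-image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:5000/v1/vision/detection",
        files={"image": f},
        timeout=70,  # the server itself gives up after 1m0s
    )

print(response.json())
# On the failing GPU image this prints:
# {'success': False, 'error': 'failed to process request before timeout', 'duration': 0}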
More info:
sudo nvidia-docker run --name=deepstack --gpus all -e MODE=High -e VISION-DETECTION=True -v deepstack:/datastore -p 5000:5000 deepquestai/deepstack:gpu
DeepStack: Version 2021.02.1
/v1/vision/detection
---------------------------------------
---------------------------------------
v1/backup
---------------------------------------
v1/restore
[GIN] 2021/04/02 - 22:05:09 | 500 | 1m0s | 172.17.0.1 | POST /v1/vision/detection
Host nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39 Driver Version: 460.39 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 960 On | 00000000:07:00.0 Off | N/A |
| 7% 45C P8 14W / 130W | 1MiB / 2000MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
root@21951afe7542:/app/server# cat logs/stderr.txt
exit status 1
chdir intelligencelayer\shared: The system cannot find the path specified
root@21951afe7542:/app/server# cat ../logs/stderr.txt
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/app/intelligencelayer/shared/detection.py", line 69, in objectdetection
detector = YOLODetector(model_path, reso, cuda=CUDA_MODE)
File "/app/intelligencelayer/shared/./process.py", line 36, in __init__
self.model = attempt_load(model_path, map_location=self.device)
File "/app/intelligencelayer/shared/./models/experimental.py", line 159, in attempt_load
torch.load(w, map_location=map_location)["model"].float().fuse().eval()
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 584, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 842, in _load
result = unpickler.load()
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 834, in persistent_load
load_tensor(data_type, size, key, _maybe_decode_ascii(location))
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 823, in load_tensor
loaded_storages[key] = restore_location(storage, location)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 803, in restore_location
return default_restore_location(storage, str(map_location))
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 174, in default_restore_location
result = fn(storage, location)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 150, in _cuda_deserialize
device = validate_cuda_device(location)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 134, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
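A quick way to confirm what that traceback says is to check CUDA visibility from inside the running container (a minimal check; it assumes python3 and torch are on the image, which the traceback paths suggest):

# Run inside the container, e.g. via: docker exec -it deepstack python3
import torch

print(torch.cuda.is_available())   # False here is exactly what breaks torch.load above
print(torch.version.cuda)          # CUDA version PyTorch was built against
print(torch.cuda.device_count())   # should be >= 1 once --gpus all is actually working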
Same. Shame as it has been solid for months.
I wonder when there is going to be an update? It has been two months since the last check-in on this project. It is a great project and it would be a shame to see it abandoned!
Hello @rickx34 @Kosh42 @gillonba
Thanks for reporting this. Sorry we have not been able to attend to issues for a while now. We have an update to DeepStack coming this month.
On this issue, it appears DeepStack is unable to detect the GPU. Also, I notice from the nvidia-smi output above that the CUDA version is N/A (CUDA Version: N/A).
Did you attempt to install CUDA, and if so, what version was installed?
@johnolafenwa I think that nvidia-smi output is from the host. I presume the Docker image has CUDA installed; I can run nvidia-smi inside the container and it reports a CUDA version.
Folks, it's a shame, but we have to update the docs for Docker on Linux...
When you run a container with GPU access on Linux you have to pass the --privileged flag so the container can access the NVIDIA devices on the host. You can also fiddle with the --device parameter, but the quickest way is just --privileged:
docker run --gpus all --privileged ...<rest of the parameters>
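If you start the container from a script rather than the CLI, the same flags map onto the Docker SDK for Python roughly like this (a sketch, assuming docker-py >= 4.3; the image, env and volume values are the ones used earlier in this thread):

# Rough docker-py equivalent of "docker run --gpus all --privileged ..."
import docker

client = docker.from_env()
container = client.containers.run(
    "deepquestai/deepstack:gpu",
    detach=True,
    privileged=True,                               # same as --privileged
    environment={"VISION-DETECTION": "True"},
    ports={"5000/tcp": 5000},
    volumes={"localstorage": {"bind": "/datastore", "mode": "rw"}},
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],  # same as --gpus all
)
print(container.short_id)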
I'm having this exact problem and the same error on Debian 11, but haven't been able to get past it. I tried --privileged as well.
Have the CPU version working for all three: VISION-SCENE, VISION-DETECTION, VISION-FACE. Really nice work!
Now, with the GPU option, only VISION-SCENE and VISION-DETECTION are working. VISION-FACE is timing out:
[GIN] 2022/03/18 - 22:53:21 | 500 | 1m0s | 172.17.0.1 | POST "/v1/vision/face/"
docker run --gpus all --privileged -e VISION-FACE=True -v /mnt/user/security/datastore:/datastore -p 5000:5000 deepquestai/deepstack:gpu-2022.01.1
Also tried deepquestai/deepstack:gpu-x5-beta with the same result.
Running Intel Core i5-6500 and GeForce GTX 1050 on Ubuntu 20.04 LTS, downloaded today, fresh install.
CUDA is working inside Docker; test command and output below:
docker run --gpus all nvidia/cuda:11.0-base nvidia-smi
NVIDIA-SMI 510.54       Driver Version: 510.54       CUDA Version: 11.6
Simple install notes to make this quick to replicate; the attachments also include Python test scripts:
install-notes.txt
python.zip
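For anyone reproducing this, the face timeout can be triggered with a request along these lines (a sketch following the public DeepStack face API; the image path is illustrative rather than taken from the attached scripts):

# Sketch of a VISION-FACE request against the documented face endpoint.
import requests

with open("face-test.jpg", "rb") as f:
    r = requests.post(
        "http://localhost:5000/v1/vision/face",
        files={"image": f},
        timeout=70,
    )

print(r.status_code, r.json())
# On gpu-2022.01.1 this comes back as HTTP 500 after the 1m0s timeout logged above.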
Also installed NVIDIA cuDNN 8, with the same timeout happening with DeepStack GPU face; steps taken below:
OS="ubuntu2004"
sudo apt-get update
wget https://developer.download.nvidia.com/compute/cuda/repos/${OS}/x86_64/cuda-${OS}.pin
sudo mv cuda-${OS}.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/${OS}/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/${OS}/x86_64/ /"
apt search libcudnn
apt-get install libcudnn8 libcudnn8-dev
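A quick way to check whether cuDNN is actually visible after this install is a short PyTorch probe (a sketch; it assumes the same python3/torch stack the container ships with):

# cuDNN/CUDA visibility check with PyTorch.
import torch

print(torch.cuda.is_available())            # GPU visible at all?
print(torch.backends.cudnn.is_available())  # cuDNN usable by PyTorch?
print(torch.backends.cudnn.version())       # e.g. 8xxx for the libcudnn8 installed above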
Must have been the memory of the graphics card; everything is working on the 1080 Ti, which has 11 GB of memory.
Hi,
gpu-2022.01.1 does not work for me on any endpoint. I get a timeout after 1m.
gpu-2021.09.1 works for me on every endpoint, even without --privileged.
I have a GeForce GTX 1650 4G.
Here's my nvidia-smi on gpu-2022.01.1:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: 460.91.03 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1650 Off | 00000000:01:00.0 Off | N/A |
| 36% 40C P8 7W / 75W | 0MiB / 3909MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Here's my nvidia-smi on gpu-2021.09.1:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: 460.91.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1650 Off | 00000000:01:00.0 Off | N/A |
| 37% 43C P0 21W / 75W | 3072MiB / 3909MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
EDIT: notice the memory is 0MiB / 3909MiB on gpu-2022.01.1... so nothing has been loaded, I guess.
Same issue for me with a GTX 750Ti, works perfectly with gpu-2021.09.1 but not with gpu-2022.01.1
Any update to this?
Just an FYI, I ran across the same issue after Redis crashed on the Docker system I was running on. Probably not the most common cause, but the timeouts looked the same, and tailing /app/logs/stderr.txt in the container revealed the issue.
So a year and a half later and still no news on something as basic as GPU usage in image-recognition software.
Any news on this subject ?
Any news on this subject ?
The project has been dead for over two years. You can switch to CodeProject AI. https://www.codeproject.com/AI/docs/
Oh, thank you for pointing me to the successor.
Just a quick question. I see:
The Docker GPU version is specific to nVidia's CUDA enabled cards with compute capability >= 6.0
So my graphics card with compute capability 3.0 is useless with this project, I guess? Just to be sure: is there any way to use it anyway?
I use the Windows version, so I can't speak to specifics. But I'd grab the Docker CPU version; once it's installed, there are multiple modules you can use that will operate on older GPUs. Within CPAI there are multiple types of processing modules available to install and use for image processing (and a few for sound, facial recognition, text processing like license plates, etc.).