Jetson docker image consuming a lot of resources when idling
Opened this issue · 13 comments
Awesome work!
Everything is running fine, but I'm surprised the CPU usage is this high when it's idling. I was looking at the code, guessing the main loop is not waiting for input, but I ran out of time finding where detection.py's objectdetection() is called and what delay parameter is used.
Running docker with:
sudo docker run --runtime nvidia --restart unless-stopped -e VISION-DETECTION=True -p 80:5000 deepquestai/deepstack:jetpack-x3-beta
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
5f58358a348c hopeful_gnu 31.63% 1.475GiB / 3.863GiB 38.18% 51.5MB / 305kB 254MB / 45.1kB 15
Ok, looking at the code I see
SLEEP_TIME = 0.01
in shared.py, which is unfortunate since all the other options allow environment overrides. All the docker images supply ENV SLEEP_TIME 0.01, so it looks like it was just a miss.
Short term solution:
Change line 38 in shared.py from
SLEEP_TIME = 0.01
to
SLEEP_TIME = os.getenv("SLEEP_TIME", 0.01)
I don't do much Python, so it may need some string-to-number conversion.
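For what it's worth, a minimal sketch of that change with the conversion handled (os.getenv returns a string when the variable is set, so it needs a float(); treat this as a suggestion, not the project's actual code):

import os

# Read the polling delay from the environment, falling back to the current default.
# os.getenv returns a string when SLEEP_TIME is set, so convert before use.
SLEEP_TIME = float(os.getenv("SLEEP_TIME", 0.01))

With that in place, the ENV SLEEP_TIME 0.01 already baked into the docker images, or an override like -e SLEEP_TIME=1.0 at docker run time, would actually take effect.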
Long term solution:
Don't busy wait with a delay. Wake up the thread when there is a new image processing request.
Ok, lunch is over so back to work here. I looked into Redis and redis-py, and it seems there is a Pub/Sub system for subscribing to database events. A kinder (to the CPU) way would be to subscribe to events related to the IMAGE_QUEUE key and put the detection loop in the callback, as described here:
https://redis.io/topics/notifications
https://stackoverflow.com/questions/55112705/redis-python-psubscribe-to-event-with-callback-without-calling-listen
I may look at making this change next week in a fork but just thought I would think out loud here in case someone else is looking at this.
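To make the idea concrete, here is a rough sketch of what that could look like with redis-py keyspace notifications. The IMAGE_QUEUE key name comes from the discussion above; the handler, host, and database number are placeholders, and I'm assuming the queue is a Redis list, so treat this as an illustration rather than a drop-in patch:

import redis

r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

# Keyspace notifications are off by default: "K" = keyspace channel, "l" = list commands.
r.config_set("notify-keyspace-events", "Kl")

def handle_queue_event(message):
    # message["data"] is the command that touched the key, e.g. "rpush".
    # A real worker would pop the request from IMAGE_QUEUE and run detection here.
    print("IMAGE_QUEUE changed:", message["data"])

p = r.pubsub()
# Fire the callback whenever the IMAGE_QUEUE key in db 0 is modified.
p.psubscribe(**{"__keyspace@0__:IMAGE_QUEUE": handle_queue_event})

# run_in_thread drives the listen loop; this sleep only paces the pubsub socket polling,
# it is not a busy-wait over the work queue itself.
worker = p.run_in_thread(sleep_time=0.5)

The detection loop would then be event-driven instead of waking up every SLEEP_TIME seconds to check the queue.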
Thank you @MFornander
This is a great suggestion. I am looking forward to the PR.
You might find this useful:
https://aioredis.readthedocs.io/en/v1.3.0/examples.html
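If the asyncio route is preferred, a comparable sketch against the aioredis 1.3 API linked above could look like the following; it assumes keyspace notifications are already enabled (notify-keyspace-events "Kl") and reuses the same hypothetical IMAGE_QUEUE channel as before:

import asyncio
import aioredis

async def watch_image_queue():
    # Connect and subscribe to the keyspace channel for the IMAGE_QUEUE key in db 0.
    redis = await aioredis.create_redis("redis://localhost")
    channel, = await redis.subscribe("__keyspace@0__:IMAGE_QUEUE")
    while await channel.wait_message():
        event = await channel.get(encoding="utf-8")
        # event is the command that modified the key, e.g. "rpush";
        # a real worker would pop and process the queued request here.
        print("IMAGE_QUEUE event:", event)

asyncio.get_event_loop().run_until_complete(watch_image_queue())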
I managed to fork, build the Go server, and build a new docker image, but I'm having issues getting a working image.
You guys are probably busy, but if you complete the Build from Source section, you may get more people helping out.
Where I am right now:
- Fork DeepStack to mattiasf/DeepStack
- Clone my fork to Jetson (building locally)
- Download arm64 go compiler from https://golang.org/dl/
- Compile go server: cd server && go build && cd ..
- Build docker image: docker build -t mattiasf/deepstack:jetpack -f Dockerfile.gpu-jetpack .
- Run local image:
sudo docker run --runtime nvidia --restart unless-stopped -e MODE=High -e VISION-DETECTION=True -e SLEEP_TIME=1.0 -p 80:5000 mattiasf/deepstack:jetpack
The server comes up but doesn't respond to REST calls. What's missing? I can't really help without getting the unmodified fork up and running.
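For reference, this is roughly how I'm poking the server from Python; it assumes the standard DeepStack detection endpoint and a local test.jpg, so adjust as needed:

import requests

# Send a test image to the object detection endpoint exposed on host port 80.
with open("test.jpg", "rb") as image:
    response = requests.post(
        "http://localhost:80/v1/vision/detection",
        files={"image": image},
    )

print(response.status_code)
print(response.json())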
Hello, I will update the readme with build instructions today. In the meantime, when you run DeepStack you can view the logs to see what went wrong. Will add a guide for that too.
Do the following
- Get the name of the container with
sudo docker ps
- Run
sudo docker exec -it container-name
- Once inside the container, run
apt-get install nano
- cd to the logs directory
cd /logs/
- Open the error logs with
nano stderr.txt
Let me know what the contents of the file are.
Thanks @johnolafenwa!
I had to make some changes to your steps and am including them here in case someone else follows along. I skipped nano too.
sudo docker ps
sudo docker exec -it {container-name} bash
cat /app/logs/stderr.txt
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/app/intelligencelayer/shared/detection.py", line 66, in objectdetection
detector = YOLODetector(model_path, reso, cuda=CUDA_MODE)
File "/app/intelligencelayer/shared/process.py", line 36, in __init__
self.model = attempt_load(model_path, map_location=self.device)
File "/app/intelligencelayer/shared/models/experimental.py", line 159, in attempt_load
torch.load(w, map_location=map_location)["model"].float().fuse().eval()
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 585, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 755, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
Hello @MFornander, thanks for getting this to work and posting your steps.
It appears that when you cloned the repo, you didn't fetch the model files with git lfs, which would have resulted in the model files being invalid.
You need to install git lfs and run
git lfs fetch
from the repo root. This will fetch all the model files; then build DeepStack afterwards.
I found out Blue Iris just added native support for DeepStack instead of using AITool to act as middleware between BI and DeepStack. It also supports DeepStack on Windows, and I saw there was a new build. I installed DeepStack on Windows and everything seemed to run great. I noticed, though, that Python was consuming 10-20% of the CPU (on a fairly powerful CPU). I had read that DeepStack on Windows is not as efficient, so I assumed that was where the Python resource usage was coming from.
So I went back to my Linux docker image and upgraded it to the latest. Lo and behold, Python is also running that system hard. PS shows: "python3 /app/intelligencelayer/shared/detection.py"
I tried deleting the entire image and starting from a fresh pull and it's still occurring. My docker start command:
docker run --detach --name=deepstack --restart=always -e MODE=High -e VISION-DETECTION=True -e VISION-FACE=True -v localstorage:/datastore -p 5000:5000 --name deepstack deepquestai/deepstack:latest
I think I might need to roll back to an older version, as the resource hit is not acceptable for an idle application.
I rolled back a zfs snapshot that contained the docker image from earlier this morning, and the system is running well again. Not sure how to identify the DeepStack version:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
deepquestai/deepstack latest 8b917481f961 2 months ago 2.97GB
So I spoke too soon about the older version not having the idle resource issue. It seems that as soon as Blue Iris connects to DeepStack, the resource usage goes up, old version included. Not sure what Blue Iris is doing in the background.
Same issue here with the normal docker image. The DeepStack container is idling at around 10-15% on a Xeon E-2224 CPU (under Xen), with python3 creating the load. Hope there will be a fix soon - the load should not be that high when it's not doing anything. @johnolafenwa: any idea what is creating this idle usage?
I worked with Ken at Blue Iris to figure out the problem that was causing the load and a memory leak. When using PTZ cameras on patrol, having the camera's DeepStack option "Detect/Ignore static objects" enabled caused this problem. After disabling it, all was happy again.
I am not using Blue Iris. I just irregularly send pictures to the docker container (which only has VISION-FACE=True enabled) for face recognition.