libopenblas on raspberry pi renders libsurvive unusable
jamuus opened this issue · 6 comments
Hey,
Not sure if it's a bug, but I spent a long time debugging a performance issue I had. I have two identical Raspberry Pi 4s: one with a fresh install of Raspberry Pi OS with libsurvive + the recommended apt-get installs, and one set up a while ago by installing whatever the build complained about next and apt-get'ing that. The pi I set up a while ago has great performance, using about 30% CPU on one thread, outputting frequent updates with no sign of tracking or performance issues.
The fresh pi was reporting tracking issues (plus "... is probably dropping IMU packets") and had all 4 cores pegged at 100%.
After a while of swapping kernel versions and diffing apt list --installed, I found removing openblas solved the issue. Specifically, apt remove libopenblas-base (the -dev package didn't appear to affect anything). For me, this is 100% reproducible: reinstalling that package reintroduced the high CPU and unusable/unstable tracking results.
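For anyone repeating the comparison, here is a minimal sketch of diffing two saved `apt list --installed` outputs. The function names and the sample package lines (including version strings) are made up for illustration; only the `name/suite version arch [status]` line shape is assumed from apt's usual output.

```python
def installed_packages(apt_list_output):
    """Extract package names from `apt list --installed` text."""
    names = set()
    for line in apt_list_output.splitlines():
        line = line.strip()
        # Package lines look like "name/suite version arch [installed]".
        # The "Listing..." header and blank lines contain no "/".
        if not line or "/" not in line:
            continue
        names.add(line.split("/", 1)[0])
    return names


def package_diff(output_a, output_b):
    """Return (only_in_a, only_in_b) as sorted lists of package names."""
    a = installed_packages(output_a)
    b = installed_packages(output_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical sample data, not taken from the machines in this issue.
fresh_pi = """Listing...
libopenblas-base/stable,now 0.0.0-1 armhf [installed]
libusb-1.0-0/stable,now 0.0.0-1 armhf [installed]
"""
old_pi = """Listing...
libusb-1.0-0/stable,now 0.0.0-1 armhf [installed]
"""

only_fresh, only_old = package_diff(fresh_pi, old_pi)
print("only on fresh pi:", only_fresh)
print("only on old pi:", only_old)
```

Saving the output of each machine to a file and passing the file contents to `package_diff` narrows the candidates to the packages present on only one side.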
I guess something is up with the performance of that lib on rpi/ARM. I haven't checked what is getting linked without it installed; I have no blas packages from apt installed at this point.
Hopefully this post has enough keywords to help someone else with this issue. I saw there are a few other posts here either struggling with an rpi or giving mixed advice about how to get it to work.
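One way to check what actually gets linked is to look at `ldd` output for the built library. The sketch below is only an illustration: the helper names and the substring hints are assumptions, and the path to the libsurvive shared object depends on your build, so it is left as a parameter.

```python
import os
import subprocess

# Substrings that suggest a BLAS/LAPACK provider (an assumed, non-exhaustive list).
BLAS_HINTS = ("blas", "lapack", "atlas")


def blas_related_libs(ldd_output):
    """Return (soname, resolved_path) pairs from ldd text that look BLAS-related."""
    found = []
    for line in ldd_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        soname = parts[0]
        if any(hint in soname.lower() for hint in BLAS_HINTS):
            # Typical line: "libfoo.so.0 => /path/libfoo.so.0 (0x...)"
            has_target = len(parts) >= 3 and parts[1] == "=>"
            found.append((soname, parts[2] if has_target else None))
    return found


def linked_blas(shared_object_path):
    """Run ldd on a binary or .so and report BLAS-like dependencies.

    The resolved path may itself be a symlink (for example one managed by
    the distribution's alternatives system), so follow it to the real file.
    """
    out = subprocess.run(
        ["ldd", shared_object_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return [
        (soname, os.path.realpath(path) if path else None)
        for soname, path in blas_related_libs(out)
    ]
```

An empty result from `linked_blas` on the libsurvive library would suggest the build is using its own math backend rather than a system BLAS.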
As an aside, once I fixed this issue I tested on the new 64-bit rpi images and am getting a 15+% reduction in reported CPU usage compared to the 32-bit image, on a slightly modified api_example with one Vive tracker over USB.
Thanks, guys, for the work on this project.
This is a somewhat known bug, but I think it isn't documented outside of the Discord channel, so this is a good bug report. It might be caused by an openblas update?
There is a reference implementation of blas provided by lapack itself. Not sure if debian/ubuntu provide it, and it says:
"A Fortran 77 reference implementation of the BLAS is available from netlib; however, its use is discouraged as it will not perform as well as a specifically tuned implementation."
But at least it works properly, as opposed to openblas.
libsurvive has its own implementation of the relevant math without blas at all. You get that one if you uninstall lapacke/your blas implementation, or if you compile with: cmake -DUSE_EIGEN=ON ..
Might be worth checking which one is more CPU efficient on low-power devices like a Raspberry Pi. There might also be other blas implementations; if some are tuned for the Raspberry Pi, they might give a small edge?
Was this with the latest code? I thought I had put an end to the blas bug a while ago.
That's something I should have checked. I last pulled a few months ago, so not quite origin/master. I'll see if I get the same behaviour there.
Unfortunately with a fresh clone I'm seeing segfaults when trying to calibrate and repeating tracker disconnects using an existing calibration.
I did a quick manual bisect and found a commit in December that didn't fault and also showed no issue with openblas being installed. So it does look like the issue I saw was fixed at some point; although a cursory check on reported CPU utilisation showed around 60% of a core so I guess the libraries in use have changed.
I'll see if I can dig into the new issues I'm seeing, and maybe have a go at configuring different blas libs to understand what the performance implications are.
Cheers
Managed to do a proper bisect for the disconnect/segfault issue. The culprit ended up being bd653694476 "Reintroduce timeout for usb events". On my setup, even increasing the timeout to 10 seconds didn't seem to prevent the device being reported as disconnected and reconnecting periodically; I had to unset the timeout to get it working consistently. But that's not related to this issue.
I had a quick go at configuring different libraries, but I'm having difficulty turning off Eigen so that it would use one of the blas libs. Even after replacing the USE_EIGEN=ON references, it still wanted to download and compile Eigen, and I would get "Using eigen backend". But I do see the submodules seem to be configurable separately, so maybe I'm getting confused about the output.
Anyway, after getting master to compile and fiddling with the CPU governor, I'm seeing great tracking stability and performance while only using 11% CPU. This is with a single Vive tracker 3.0 over USB and 2x Lighthouse 2.0.
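Since the governor setting affects the CPU figures quoted above, it can help to record it alongside any measurement. This is a small sketch that reads the current governor per core from the standard Linux cpufreq sysfs files; the function name is made up here, and the post does not say which governor was selected.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU (e.g. "cpu0") to its current cpufreq scaling governor."""
    governors = {}
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        # Path shape: <root>/cpuN/cpufreq/scaling_governor
        cpu = path.split(os.sep)[-3]
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor)
```

On systems without cpufreq support the function simply returns an empty dict.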