CI Timing Out Due to Persistent Flight Software Binary Threads
TL;DR: PTest/thread handling in flight software sucks and isn't shutting down the flight software binary processes properly. Running all the mission checkouts locally on my desktop eventually pinned all 24 cores and consumed 32 GB of RAM plus 38 GB of swap, crashing other processes running in the background, like my Chrome tabs.
Upgrade PTest to properly shut things down.
I think something like this was happening once I added changes to handle signals properly:
- Upon PTest termination we send `SIGTERM` to the main flight software process thread.
- The main thread then tells the reader thread in `debug_console` to stop and blocks until that thread exits.
- The reader thread, however, is blocked waiting for another line of input and therefore never exits.
- This means the main thread doesn't exit, and we just get stuck with extra processes left around the system.
- Not really sure why this grabs more and more memory and pins the CPU, however.
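The hang described in the list above can be reproduced in miniature. The following is only a Python sketch of the pattern, not the flight software code: `blocking_reader` is a hypothetical stand-in for the reader thread in `debug_console`, and a pipe stands in for its input. The reader only looks at the stop flag after a read returns, so setting the flag while it is blocked has no effect.

```python
import os
import threading


def blocking_reader(read_fd, stop):
    # Hypothetical stand-in for the debug_console reader thread.
    # The stop flag is only checked after a read returns.
    with os.fdopen(read_fd, "r") as stream:
        while True:
            line = stream.readline()  # blocks until a full line or EOF arrives
            if not line or stop.is_set():
                break


read_fd, write_fd = os.pipe()
stop = threading.Event()
reader = threading.Thread(target=blocking_reader, args=(read_fd, stop), daemon=True)
reader.start()

stop.set()                # the main thread asks the reader to stop...
reader.join(timeout=0.5)  # ...but a plain join() here would wait forever
still_blocked = reader.is_alive()
print("reader still blocked after stop request:", still_blocked)

os.close(write_fd)        # only EOF (or a new line) releases the reader
reader.join(timeout=2.0)
```

The bounded `join(timeout=...)` is there only so the demonstration terminates; with an unbounded join, the main thread would sit there exactly as described above.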
Merging this should hopefully unblock the other PRs that are blocked on CI.
Implementation for PR copied from here:
https://stackoverflow.com/questions/15524122/how-to-implement-timeout-for-getline
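For readers who want the general idea without reading the link, here is a minimal Python sketch of a line reader with a timeout. It is not the code in this PR: `polling_reader`, `poll_s`, and the pipe are illustrative assumptions, and `select` on a pipe assumes a POSIX system. The reader waits for input for a short interval at a time, so it gets a chance to notice the stop flag even when no input arrives.

```python
import os
import select
import threading
import time


def polling_reader(read_fd, stop, lines, poll_s=0.05):
    # Hypothetical reader: waits for input at most poll_s seconds at a time.
    buf = b""
    while not stop.is_set():
        ready, _, _ = select.select([read_fd], [], [], poll_s)
        if not ready:
            continue  # timed out: loop around and re-check the stop flag
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break  # EOF: the writer closed its end
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            lines.append(line.decode())


read_fd, write_fd = os.pipe()
stop = threading.Event()
lines = []
reader = threading.Thread(target=polling_reader, args=(read_fd, stop, lines))
reader.start()

os.write(write_fd, b"ping\n")
deadline = time.monotonic() + 2.0
while not lines and time.monotonic() < deadline:
    time.sleep(0.01)  # give the reader time to pick up the line

stop.set()                # no further input is coming...
reader.join(timeout=2.0)  # ...yet the reader exits within about one poll interval
exited_cleanly = not reader.is_alive()
os.close(read_fd)
os.close(write_fd)
```

The trade-off is a small amount of polling overhead in exchange for a bounded shutdown time of roughly one `poll_s` interval.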
Added this BLAS limiter to pass CI. I'm unsure whether the failure is caused by this PR, but the limiter should be harmless to add:
https://stackoverflow.com/questions/52026652/openblas-blas-thread-init-pthread-create-resource-temporarily-unavailable
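As a rough illustration of what such a limiter can look like, the sketch below caps OpenBLAS threads through the `OPENBLAS_NUM_THREADS` environment variable when launching a child process. This is an assumption about the mechanism: the description above does not say where or how the limit is set in this PR, the value `1` is only an example, and `limited_blas_env` is a hypothetical helper.

```python
import os
import subprocess
import sys


def limited_blas_env(threads=1):
    # Copy the current environment and cap the OpenBLAS thread pool size.
    # OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded.
    env = dict(os.environ)
    env["OPENBLAS_NUM_THREADS"] = str(threads)
    return env


# Launch a child process with the cap applied and confirm that it sees the value.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"],
    env=limited_blas_env(1),
    capture_output=True,
    text=True,
    check=True,
)
seen = result.stdout.strip()
print("child sees OPENBLAS_NUM_THREADS =", seen)
```

Because the variable is read at load time, it needs to be in the environment before the process that loads OpenBLAS starts, which is why the sketch sets it on the child's environment and not after the fact.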