Proposed solution doesn't seem to work anymore
Paskin opened this issue · 4 comments
With Host OS 2.107.13+rev1, as well as 2.106.8, running rmmod on the Nouveau module triggers a kernel exception inside the module (the infamous nvkm_falcon_v1_wait_for_halt). This leaves the GPU in an intermediate state and crashes the original Nvidia drivers when they are loaded later.
@Paskin are you referring to the Generic x86_64 (GPT) device type? Was 2.105.32+rev2 the last version that worked for you?
are you referring to the Generic x86_64 (GPT) device type?
Yes
Was 2.105.32+rev2 the last version that worked
I did some research: no version with a 5.x kernel seems to work. The only "hack" I found is to remove the nouveau module from the installation flash image. Unfortunately, I do not remember why blacklisting wasn't helping, and I no longer have this machine.
I hope Balena will one day produce an image with official support for nvidia-container-toolkit etc. (especially for Jetsons, where the lack of GPU support defeats the main point of the platform), letting customers run unmodified containers from NVCR.
Does anyone from Balena have an update on this?
Sometimes the Nouveau module can't be unloaded because Plymouth is using it. There's an example of how to work around this on our forums that may be helpful: https://forums.balena.io/t/blacklist-drivers-in-host-os/163437/25
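To check whether something is still holding the module before trying rmmod, a rough sketch like the following may help. It is not taken from the forum post. It assumes the usual /proc/modules layout (name, size, reference count, comma-separated list of modules using it, state, address), and the helper name module_usage is made up for illustration.

```python
def module_usage(modules_text, name="nouveau"):
    """Return (refcount, holders) for `name`, or None if it is not loaded.

    `modules_text` is the content of /proc/modules. The layout assumed here is:
    name size refcount holders state address
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != name:
            continue
        # The refcount is "-" on kernels built without module unloading support.
        refcount = int(fields[2]) if fields[2].isdigit() else None
        # The holders field is "-" when no other module uses this one;
        # otherwise it is a comma-separated list with a trailing comma.
        if fields[3] == "-":
            holders = []
        else:
            holders = [h for h in fields[3].split(",") if h]
        return refcount, holders
    return None
```

On the device this could be called as module_usage(open("/proc/modules").read()). A non-zero reference count with an empty holders list suggests a userspace process (Plymouth, for example) still has the device open, and rmmod is likely to fail until that process is stopped.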