davidADSP/Generative_Deep_Learning_2nd_Edition

docker-compose.gpu.yml error upon running

I'm on Windows with an NVIDIA 4070 and get the error below when I try to launch Docker with docker-compose.gpu.yml. Any idea how to resolve this?

PS C:\Projects\GenerativeDeepLearning> docker compose -f docker-compose.gpu.yml up
[+] Running 2/0
✔ Network generativedeeplearning_default Created 0.0s
✔ Container generativedeeplearning-app-1 Created 0.1s
Attaching to app-1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 2, stdout: , stderr: fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7fcd133aad54]

runtime stack:
runtime.throw({0x5286a1?, 0x6d?})
/usr/local/go/src/runtime/panic.go:992 +0x71
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:802 +0x389

goroutine 1 [syscall]:
runtime.cgocall(0x4f48d0, 0xc00017d958)
/usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc00017d930 sp=0xc00017d8f8 pc=0x40523c
github.com/NVIDIA/go-nvml/pkg/dl._Cfunc_dlopen(0x9c8820, 0x1)
_cgo_gotypes.go:113 +0x4d fp=0xc00017d958 sp=0xc00017d930 pc=0x4ee78d
github.com/NVIDIA/go-nvml/pkg/dl.(*DynamicLibrary).Open(0xc00017da30)
/go/src/nvidia-container-toolkit/vendor/github.com/NVIDIA/go-nvml/pkg/dl/dl.go:55 +0x74 fp=0xc00017d9d0 sp=0xc00017d958 pc=0x4ee994
gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/info.(*infolib).HasNvml(0xc00012c1e0?)
/go/src/nvidia-container-toolkit/vendor/gitlab.com/nvidia/cloud-native/go-nvlib/pkg/nvlib/info/info.go:47 +0x85 fp=0xc00017da68 sp=0xc00017d9d0 pc=0x4eed85
github.com/NVIDIA/nvidia-container-toolkit/internal/info.ResolveAutoMode({0x54f5c8, 0x6333e0}, {0xc000138157?, 0x52974f?})
/go/src/nvidia-container-toolkit/internal/info/auto.go:42 +0x1bb fp=0xc00017db18 sp=0xc00017da68 pc=0x4ef53b
main.doPrestart()
/go/src/nvidia-container-toolkit/cmd/nvidia-container-runtime-hook/main.go:77 +0xdd fp=0xc00017df08 sp=0xc00017db18 pc=0x4f2e7d
main.main()
/go/src/nvidia-container-toolkit/cmd/nvidia-container-runtime-hook/main.go:176 +0x11e fp=0xc00017df80 sp=0xc00017df08 pc=0x4f43de
runtime.main()
/usr/local/go/src/runtime/proc.go:250 +0x212 fp=0xc00017dfe0 sp=0xc00017df80 pc=0x4368d2
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00017dfe8 sp=0xc00017dfe0 pc=0x460981: unknown

svrc commented

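Have you tried running it from a terminal inside a WSL2 instance? The trace shows the NVIDIA container runtime hook crashing with a SIGSEGV while it loads NVML (the dlopen call under HasNvml), so the failure is in the underlying NVIDIA library rather than in the compose file. Docker Desktop might not be able to reach the NVIDIA card unless it's running in WSL2 mode.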
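
A rough sketch of how that check could go, run from a shell inside the WSL2 distro rather than from PowerShell. It assumes the Windows NVIDIA driver with WSL support is installed and that Docker Desktop uses the WSL2 backend with integration enabled for this distro. The helper names (is_wsl2_kernel, run_gpu_checks) are made up for illustration, and the kernel-name match is only a heuristic based on how WSL2 kernels are usually named.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a kernel release string looks like WSL2.
# Assumption: WSL2 kernels are named like "5.15.90.1-microsoft-standard-WSL2".
is_wsl2_kernel() {
  case "$1" in
    *microsoft-standard*|*WSL2*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: walks through the checks in order, stopping at the first failure.
run_gpu_checks() {
  # 1. Confirm this shell is actually running under a WSL2 kernel.
  if ! is_wsl2_kernel "$(uname -r)"; then
    echo "This does not look like a WSL2 kernel: $(uname -r)"
    return 1
  fi

  # 2. Confirm the GPU is visible from WSL2 itself.
  nvidia-smi || { echo "nvidia-smi failed inside WSL2"; return 1; }

  # 3. Confirm a throwaway container can see the GPU.
  docker run --rm --gpus all ubuntu nvidia-smi || {
    echo "GPU not reachable from a container"
    return 1
  }

  # 4. Only then retry the compose file from the issue (run from the repo folder).
  docker compose -f docker-compose.gpu.yml up
}
```

Calling run_gpu_checks from the project folder runs the steps in order. If step 3 fails with the same hook error, the problem is in the Docker/NVIDIA setup and not in this repo's compose file.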