Bug: Segmentation fault in sqrl_pytorch-PyTorch CUDA
awf opened this issue · 2 comments
The easiest way to replicate the situation above is to edit launch.json to include:
{
    "name": "(gdb) pytest",
    "type": "cppdbg",
    "request": "launch",
    "program": "/anaconda/envs/knossos/bin/python",
    "args": [
        "-m",
        "pytest",
        "src/bench/",
        "-v",
        "--modulepath=examples/dl-capsule/sqrl",
        "--benchmarkname=sqrl",
    ],
    "stopAtEntry": false,
    "cwd": "${workspaceFolder}",
    "environment": [
        {"name": "PYTHONPATH", "value": "./src/python"}
    ],
    "externalConsole": false,
    "MIMode": "gdb",
    "setupCommands": [
        {
            "description": "Enable pretty-printing for gdb",
            "text": "-enable-pretty-printing",
            "ignoreFailures": true
        }
    ]
},
Then run "Debug: Select and Start Debugging" in VS Code and pick "(gdb) pytest".
The problem is that we have
@knossos.register
def sqrl(x: torch.Tensor):
    ...

def sqrl_pytorch(x: torch.Tensor):
    return sqrl(x)
which means that sqrl_pytorch isn't actually a PyTorch implementation at all: it calls the Knossos implementation. I think this was accidentally broken by the addition of the knossos.register decorator in #960. We'll need to rewrite sqrl_pytorch so that it's a genuine PyTorch implementation.
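To make the name-rebinding problem concrete, here is a minimal stdlib-only sketch. Note that `register` and `Stub` below are hypothetical stand-ins for knossos.register and whatever object it returns, not the real API, and `x * x` is a placeholder body (the real body of sqrl is elided above). It only shows that once a decorator rebinds the name `sqrl`, any function calling `sqrl(x)` goes through the stub.

```python
calls = []  # records which implementation actually ran


class Stub:
    """Hypothetical stand-in for the object a registering decorator returns."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args):
        calls.append("compiled")  # pretend we dispatch to compiled code
        return self.fn(*args)


def register(fn):
    # Hypothetical decorator: rebinds the decorated name to a Stub.
    return Stub(fn)


@register
def sqrl(x):
    return x * x  # placeholder body, NOT the real sqrl


def sqrl_pytorch_broken(x):
    # The name `sqrl` now refers to the Stub, so this is not a
    # reference implementation: it runs the "compiled" path.
    return sqrl(x)


def sqrl_pytorch_fixed(x):
    # A genuine reference has its own body and never touches the stub.
    calls.append("reference")
    return x * x  # placeholder body, NOT the real sqrl
```

In this toy version, calling `sqrl_pytorch_broken` records "compiled" while `sqrl_pytorch_fixed` records "reference", which is the distinction the benchmark needs.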
Before #976 was merged this morning, functions defined using @knossos.register were compiled for CPU only; but the "PyTorch CUDA" benchmark puts the input tensors on the GPU. The segmentation fault occurs when trying to read this data on the CPU.
With #976 merged, the KscStub detects that the input is on the GPU and tries to compile for the GPU, but this raises an error ("Only elementwise operations can be compiled for GPU"), which I think is the correct behaviour. There is no "Knossos CUDA" benchmark for sqrl, because the "Knossos CUDA" benchmark is only enabled for elementwise operations.
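The device check described above can be sketched without PyTorch or Knossos. Everything in this sketch is an assumption for illustration: `FakeTensor` stands in for torch.Tensor, `dispatch` mimics only the behaviour described in this comment, and `ValueError` is a guess at the exception type. It is not KscStub's actual code; only the error message is taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; only the device matters here."""
    device: str  # "cpu" or "cuda"


def dispatch(x: FakeTensor, elementwise: bool) -> str:
    """Pick a compilation target from the input's device (assumed logic)."""
    if x.device == "cpu":
        return "cpu"
    if not elementwise:
        # Fail loudly rather than hand GPU memory to CPU-compiled code,
        # which is what produced the segmentation fault before.
        raise ValueError("Only elementwise operations can be compiled for GPU")
    return "cuda"
```

Under these assumptions, a non-elementwise function such as sqrl given a GPU input raises a readable error instead of crashing, while CPU inputs and elementwise GPU inputs still get a target.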