ServiceNow/picard

Model training fails


When I run the command 'make train' or 'make eval device=-1' it seems to fail. Here's the resultant output:
[Screenshot: make train fail]

I'm running this through WSL2 with an AMD graphics card, but Picard apparently only supports NVIDIA GPUs? I cloned the git repo over HTTP instead of the recommended SSH since I didn't have a key.

Please, let me know if my setup is incorrect.

Edit:
It looks like the given error code results when there's insufficient memory. Is the model completely untrainable without 40GB of GPU memory? Or do some settings just need to be changed and more time taken?

Hi @SethCram
Did you solve the setup issue?

@vaib26 No, I did not. I still can't train, evaluate, or serve the model.

@vaib26 I don't have an NVIDIA graphics card, but I was able to run 'make serve' after changing the device number from 0 to -1 in this setting:

"device": 0
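
For anyone else without an NVIDIA card, here is a minimal sketch of that change as a script. It assumes the serve settings live in a JSON file with a top-level "device" key, and that -1 selects the CPU, as the workaround above suggests. The function name set_device and the file path in the usage line are hypothetical, so adjust them to your checkout.

```python
import json
from pathlib import Path


def set_device(config_path: Path, device: int = -1) -> dict:
    """Rewrite the "device" entry of a JSON config file and return the result.

    Assumption: -1 means "run on CPU" and 0 means "first GPU".
    """
    config = json.loads(config_path.read_text())
    config["device"] = device  # overwrite only this key, keep everything else
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

Usage would look something like set_device(Path("path/to/serve_config.json")), run before 'make serve'. This only covers serving. It does not address the memory error seen with 'make train'.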