aws-neuron/aws-neuron-sdk

Is it possible to compile a model when no NeuronCores are available?

CozyDoomer opened this issue · 2 comments

I am trying to compile a model locally to run on Inf instances but am running into this error:
RuntimeError: Bad StatusOr access: INVALID_ARGUMENT: PJRT_Client_Create: error condition nullptr != (args)->client->Error(): Init: error condition !(num_devices > 0)

This indicates to me that no NeuronCores are found on my system. I couldn't find confirmation that compilation using torch_neuronx.trace is only possible when NeuronCores are available.

Is this a requirement? If so, does that mean compilation is only possible on one of the NeuronCore instance types?

Additional info:
I'm using this docker image locally:
public.ecr.aws/neuron/pytorch-inference-neuronx:2.1.2-neuronx-py310-sdk2.18.2-ubuntu20.04

Python version==3.10.12
torch==2.1.2
torch-neuronx==2.1.2.2.1.0
torch-xla==2.1.2

This issue seems similar:
#902 (comment)

By default, torch_neuronx.trace() supports compilation on non-Neuron devices for torch 1.13. However, a Neuron device is currently required for torch 2.x compilation. This constraint may be removed in a future release.
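
To fail early with a clearer message than the PJRT error above, a pre-flight check along these lines may help. This is only a sketch: it assumes the Neuron driver exposes devices as `/dev/neuron*` nodes and that they have been passed through to the container. The helper names `neuron_devices` and `can_compile` are made up for illustration and are not part of torch_neuronx.

```python
import glob


def neuron_devices(pattern="/dev/neuron*"):
    """Return the Neuron device nodes visible to this process.

    Assumption: the driver exposes devices as /dev/neuron0, /dev/neuron1, ...
    Inside Docker they are only visible if passed through to the container.
    """
    return sorted(glob.glob(pattern))


def can_compile(torch_major, devices):
    """Encode the rule from the reply above.

    torch 1.13 can compile without a Neuron device;
    torch 2.x currently needs at least one.
    """
    return torch_major < 2 or len(devices) > 0


if __name__ == "__main__":
    devices = neuron_devices()
    # torch_major=2 matches the torch==2.1.2 setup described in this issue.
    if not can_compile(2, devices):
        print("No Neuron device found; torch 2.x tracing is expected to fail here.")
    else:
        print("Neuron devices visible:", devices)
```

If the check reports no devices on a local machine, the options under the current constraint are to compile on a Neuron instance or to use the torch 1.13 stack.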

Thank you for the fast response!