aws-neuron/transformers-neuronx

Any solution to save the converted model?

Opened this issue · 3 comments

Converting the loaded model with the to_neuron() method takes a long time. Is there any way to save the neuron_model to disk and load it again later? This is for GPT-NeoX.

Hi @aliseyfi:

We are working on adding serialization support for all models in an upcoming release. We will update this ticket when serialization support is available.

Hey @aliseyfi, does model.save work for you?

Example code from https://huggingface.co/aws-neuron/Mistral-neuron:

# Imports and neuron_config are set up as shown in the linked model card
model_neuron = MistralForSampling.from_pretrained('mistralai/Mistral-7B-Instruct-v0.1-split', batch_size=1, \
    tp_degree=2, n_positions=256, amp='bf16', neuron_config=neuron_config)
model_neuron.to_neuron()

# Save the compiled NEFF files to the same directory
model_neuron.save("mistralai/Mistral-7B-Instruct-v0.1-split")
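Since the original question asks about loading the saved model as well, here is a minimal sketch of the reload side. It assumes your transformers-neuronx version provides the `load` counterpart to `save`, reuses the same directory as the save example above, and that `neuron_config` has been built as in the linked model card; this requires Neuron hardware to actually run.

```python
# Hedged sketch: reload previously saved Neuron artifacts instead of
# recompiling. `load` is assumed to be the counterpart to `save` in
# your transformers-neuronx version; `neuron_config` is assumed to be
# constructed as in the linked Mistral-neuron model card.
from transformers_neuronx.mistral.model import MistralForSampling

model_neuron = MistralForSampling.from_pretrained(
    'mistralai/Mistral-7B-Instruct-v0.1-split', batch_size=1,
    tp_degree=2, n_positions=256, amp='bf16', neuron_config=neuron_config)

# Point at the directory where the compiled NEFF files were saved,
# then move the model to the Neuron device without recompiling.
model_neuron.load('mistralai/Mistral-7B-Instruct-v0.1-split')
model_neuron.to_neuron()
```

Calling load before to_neuron() is what skips the expensive recompilation step.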

Sorry, I don't work on that project anymore. Thanks for the update though.