ndif-team/nnsight

Llama remote execution breaks on transformers 4.40.2

sheikheddy opened this issue · 2 comments

This PR broke LanguageModel by adding an `mlp_bias` field to the Llama config: https://github.com/huggingface/transformers/pull/30031/files

```python
from nnsight import LanguageModel

# We'll never actually load the parameters, so there's no need to specify a device_map.
model = LanguageModel("meta-llama/Llama-2-70b-hf")

# All we need to specify to run on NDIF instead of executing locally is remote=True.
with model.trace("The Eiffel Tower is in the city of", remote=True) as runner:
    hidden_states = model.model.layers[-1].output.save()
    output = model.output.save()

print(hidden_states)
print(output["logits"])
```

Still works on transformers 4.40.0.
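Since the breakage comes from a config field that exists in some transformers versions but not others, one defensive pattern on the nnsight side is to read such attributes with a default via `getattr`. A minimal sketch, where `get_mlp_bias` and the two stand-in config classes are hypothetical stand-ins for illustration, not nnsight or transformers code:

```python
class OldStyleConfig:
    """Stand-in for a Llama config from before PR #30031 (no mlp_bias field)."""
    pass

class NewStyleConfig:
    """Stand-in for a Llama config after PR #30031, which adds mlp_bias."""
    mlp_bias = False

def get_mlp_bias(config) -> bool:
    # getattr with a default tolerates configs from either version,
    # instead of raising AttributeError on older configs.
    return getattr(config, "mlp_bias", False)

print(get_mlp_bias(OldStyleConfig()))  # False (field absent, default used)
print(get_mlp_bias(NewStyleConfig()))  # False (field present, set to False)
```

The same pattern works in the other direction too: code pinned to an older transformers stays compatible when a newer config adds the field.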

@sheikheddy Can you post the stack trace? I ran with the latest transformers release and the one from GitHub, and both worked.