Reproducing model conversions
thekevinscott opened this issue · 6 comments
Question
I'm trying to reproduce the conversion of phi-1_5_dev to better understand the process. I'm running into a few bugs/issues along the way that I thought it would be helpful to document.
The model card for @Xenova/phi-1_5_dev states:
https://huggingface.co/susnato/phi-1_5_dev with ONNX weights to be compatible with Transformers.js.
I'm doing the following:
git clone https://github.com/xenova/transformers.js.git && cd transformers.js/scripts
git clone https://huggingface.co/susnato/phi-1_5_dev
python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation"
Here I hit my first issue: it looks like the transformers package on PyPI does not support Phi:
raise KeyError(key)
KeyError: 'phi'
So I install from Github:
pip install git+https://github.com/huggingface/transformers.git
That produces:
RuntimeError: Failed to import optimum.exporters.onnx.__main__ because of the following error (look up to see its traceback):
cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/Users/thekevinscott/code/codegen/research/model-conversion/throwaway/transformers.js/scripts/.venv/lib/python3.10/site-packages/transformers/pytorch_utils.py)
I believe optimum is also out of date, so I install it from GitHub as well:
pip install git+https://github.com/huggingface/optimum.git
With those two dependencies updated, this command now works:
python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation"
Though there are a few warnings I'm assuming I can ignore:
Ignore MatMul due to non constant B: /[/model/layers.22/self_attn/MatMul]
Ignore MatMul due to non constant B: /[/model/layers.22/self_attn/MatMul_1]
Ignore MatMul due to non constant B: /[/model/layers.23/self_attn/MatMul]
Ignore MatMul due to non constant B: /[/model/layers.23/self_attn/MatMul_1]
However, out of the box it can't find the right onnx file:
Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "transformers.js/scripts/models/phi-1_5_dev/onnx/decoder_model_merged_quantized.onnx".
I see in the @Xenova repo history that the files were manually renamed; I'll try that too:
mv model.onnx decoder_model_merged.onnx
mv model_quantized.onnx decoder_model_merged_quantized.onnx
mv model.onnx_data decoder_model_merged.onnx_data
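The renames can also be scripted. Here's a small sketch that mirrors the mv commands above (the helper name and structure are mine, just for illustration):

```python
from pathlib import Path

# Mapping from convert.py's output names to the names transformers.js
# looks for (mirrors the manual `mv` commands above).
RENAMES = {
    "model.onnx": "decoder_model_merged.onnx",
    "model_quantized.onnx": "decoder_model_merged_quantized.onnx",
    "model.onnx_data": "decoder_model_merged.onnx_data",
}

def rename_outputs(onnx_dir: str) -> list[str]:
    """Rename exported files in place; returns the new names that were created."""
    d = Path(onnx_dir)
    renamed = []
    for old, new in RENAMES.items():
        src = d / old
        if src.exists():
            src.rename(d / new)
            renamed.append(new)
    return renamed
```

Files that a given export didn't produce (e.g. no external-data file for the quantized model) are simply skipped.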
I then try to run the model with:
const model = await loadModel('transformers.js/scripts/models/phi-1_5_dev', {});
const result = await model('Write me a list of numbers:\n', {});
console.log('result', result);
The model loads, but upon generating I see:
WARNING: Too many inputs were provided (51 > 3). The following inputs will be ignored: "past_key_values.0.key, past_key_values.0.value, past_key_values.1.key, past_key_values.1.value, past_key_values.2.key, past_key_values.2.value, past_key_values.3.key, past_key_values.3.value, past_key_values.4.key, past_key_values.4.value, past_key_values.5.key, past_key_values.5.value, past_key_values.6.key, past_key_values.6.value, past_key_values.7.key, past_key_values.7.value, past_key_values.8.key, past_key_values.8.value, past_key_values.9.key, past_key_values.9.value, past_key_values.10.key, past_key_values.10.value, past_key_values.11.key, past_key_values.11.value, past_key_values.12.key, past_key_values.12.value, past_key_values.13.key, past_key_values.13.value, past_key_values.14.key, past_key_values.14.value, past_key_values.15.key, past_key_values.15.value, past_key_values.16.key, past_key_values.16.value, past_key_values.17.key, past_key_values.17.value, past_key_values.18.key, past_key_values.18.value, past_key_values.19.key, past_key_values.19.value, past_key_values.20.key, past_key_values.20.value, past_key_values.21.key, past_key_values.21.value, past_key_values.22.key, past_key_values.22.value, past_key_values.23.key, past_key_values.23.value".
2024-04-15 11:00:50.956 node[91488:12372370] 2024-04-15 11:00:50.956090 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]
An error occurred during model execution: "Error: Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]".
Inputs given to model: [Object: null prototype] {
input_ids: Tensor {
dims: [ 1, 1 ],
type: 'int64',
data: BigInt64Array(1) [ 13n ],
size: 1
},
attention_mask: Tensor {
dims: [ 1, 9 ],
type: 'int64',
data: BigInt64Array(9) [
1n, 1n, 1n, 1n, 1n,
1n, 1n, 1n, 1n
],
size: 9
},
position_ids: Tensor {
dims: [ 1, 1 ],
type: 'int64',
data: BigInt64Array(1) [ 8n ],
size: 1
}
}
node_modules/.pnpm/onnxruntime-node@1.14.0/node_modules/onnxruntime-node/dist/backend.js:45
resolve(__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").run(feeds, fetches, options));
^
Error: Non-zero status code returned while running Gather node. Name:'/model/layers.0/self_attn/Gather_4' Status Message: indices element out of data bounds, idx=8 must be within the inclusive range [-1,0]
at node_modules/.pnpm/onnxruntime-node@1.14.0/node_modules/onnxruntime-node/dist/backend.js:45:108
at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
Node.js v20.12.1
❌ [dev] exited with exit code 1.
❌ 1 script failed.
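For what it's worth, the Gather failure is consistent with a cache-less export being fed incremental-decoding inputs: position_ids is 8, but with no past_key_values the first attention layer only has a single position to index into. A pure-Python illustration of the bounds check (not the actual ONNX kernel, just a sketch of the semantics):

```python
def gather(data, idx):
    """Mimics ONNX Gather's bounds check for a scalar index:
    valid indices lie in the inclusive range [-len(data), len(data) - 1]."""
    lo, hi = -len(data), len(data) - 1
    if not lo <= idx <= hi:
        raise IndexError(
            f"indices element out of data bounds, idx={idx} "
            f"must be within the inclusive range [{lo},{hi}]"
        )
    return data[idx]

# With only one position available, indexing position 8 is out of bounds,
# matching the "[-1,0]" range in the runtime error above.
```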
I'm not entirely sure how to proceed from here. Any suggestions? It seems to be something specific to the .onnx file: if I replace it with the .onnx file from the @Xenova repo, it works perfectly.
It looks as though the initial model is missing inputNames.
The (working) model (@Xenova/phi-1_5_dev) has:
inputNames: [
'input_ids',
'attention_mask',
'position_ids',
'past_key_values.0.key',
'past_key_values.0.value',
'past_key_values.1.key',
'past_key_values.1.value',
'past_key_values.2.key',
'past_key_values.2.value',
'past_key_values.3.key',
'past_key_values.3.value',
'past_key_values.4.key',
'past_key_values.4.value',
'past_key_values.5.key',
'past_key_values.5.value',
'past_key_values.6.key',
'past_key_values.6.value',
'past_key_values.7.key',
'past_key_values.7.value',
'past_key_values.8.key',
'past_key_values.8.value',
'past_key_values.9.key',
'past_key_values.9.value',
'past_key_values.10.key',
'past_key_values.10.value',
'past_key_values.11.key',
'past_key_values.11.value',
'past_key_values.12.key',
'past_key_values.12.value',
'past_key_values.13.key',
'past_key_values.13.value',
'past_key_values.14.key',
'past_key_values.14.value',
'past_key_values.15.key',
'past_key_values.15.value',
'past_key_values.16.key',
'past_key_values.16.value',
'past_key_values.17.key',
'past_key_values.17.value',
'past_key_values.18.key',
'past_key_values.18.value',
'past_key_values.19.key',
'past_key_values.19.value',
'past_key_values.20.key',
'past_key_values.20.value',
'past_key_values.21.key',
'past_key_values.21.value',
'past_key_values.22.key',
'past_key_values.22.value',
'past_key_values.23.key',
'past_key_values.23.value'
],
Whereas the converted model (susnato/phi-1_5_dev) is missing the past_key_values fields:
inputNames: [ 'input_ids', 'attention_mask', 'position_ids' ],
Is there some step in the conversion I'm missing that includes these inputNames?
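For reference, the 51 inputs in the warning earlier line up with the 3 base inputs plus one key and one value tensor per layer, for phi-1.5's 24 decoder layers (past_key_values.0 through past_key_values.23 above). A quick sketch (the helper name is mine):

```python
def expected_input_names(num_layers: int) -> list[str]:
    """Input names a with-past decoder export should expose:
    the 3 base inputs plus a key and a value tensor per layer."""
    names = ["input_ids", "attention_mask", "position_ids"]
    for i in range(num_layers):
        names += [f"past_key_values.{i}.key", f"past_key_values.{i}.value"]
    return names
```

With 24 layers this gives 3 + 2 × 24 = 51 names, the same count as the "51 > 3" warning.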
@thekevinscott - I am not that experienced in this field, but I've been doing some experimenting and am running into the same issue with almost all of Xenova's models. Hopefully this might help you here.
@xenova - hoping you see this. Various models you have deployed are running into this issue. (I'm grateful for your work, by the way!)
Hi there 👋 The correct task is text-generation-with-past (note the -with-past suffix). So the command would be:
python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation-with-past"
@MarketingPip can you provide a list of these models?
Thanks for the discussion and the responses. I've been trying to implement this updated command, but dependencies seem to have shifted since I last posted. I'm trying to move these commands into a Dockerfile but am now running into new errors.
I have to step away from this but will pick it up later today; maybe it's helpful to share my progress so far.
The main challenges I'm seeing are:
- transformers listed in requirements.txt is out of date and has to be installed from GitHub
- optimum listed in requirements.txt is out of date and has to be installed from GitHub
FROM python:3.9
RUN apt-get update \
&& apt-get install -y \
less \
vim \
git \
git-lfs \
# enable h5py wheels
libhdf5-dev
RUN git lfs install
WORKDIR /code
RUN git clone https://github.com/xenova/transformers.js.git
WORKDIR /code/transformers.js/scripts
RUN git clone https://huggingface.co/susnato/phi-1_5_dev
RUN python3 -m pip install -r requirements.txt
# /usr/local/lib/python3.9/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
# torch.utils._pytree._register_pytree_node(
# Traceback (most recent call last):
# File "/code/transformers.js/scripts/convert.py", line 545, in <module>
# main()
# File "/code/transformers.js/scripts/convert.py", line 340, in main
# config = AutoConfig.from_pretrained(model_id, **from_pretrained_kwargs)
# File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
# config_class = CONFIG_MAPPING[config_dict["model_type"]]
# File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 734, in __getitem__
# raise KeyError(key)
# KeyError: 'phi'
RUN python3 -m pip install git+https://github.com/huggingface/transformers.git@df53c6e5d9245315c741ba6cce1e026d4ca104c5
# Traceback (most recent call last):
# File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1530, in _get_module
# return importlib.import_module("." + module_name, self.__name__)
# File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
# return _bootstrap._gcd_import(name[level:], package, level)
# File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
# File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
# File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
# File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
# File "<frozen importlib._bootstrap_external>", line 850, in exec_module
# File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
# File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/__main__.py", line 32, in <module>
# from .convert import export_models, validate_models_outputs
# File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/convert.py", line 48, in <module>
# from transformers.pytorch_utils import is_torch_less_than_1_11
# ImportError: cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/usr/local/lib/python3.9/site-packages/transformers/pytorch_utils.py)
#
# The above exception was the direct cause of the following exception:
#
# Traceback (most recent call last):
# File "/code/transformers.js/scripts/convert.py", line 16, in <module>
# from optimum.exporters.onnx import main_export, export_models
# File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
# File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1520, in __getattr__
# module = self._get_module(self._class_to_module[name])
# File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1532, in _get_module
# raise RuntimeError(
# RuntimeError: Failed to import optimum.exporters.onnx.__main__ because of the following error (look up to see its traceback):
# cannot import name 'is_torch_less_than_1_11' from 'transformers.pytorch_utils' (/usr/local/lib/python3.9/site-packages/transformers/pytorch_utils.py)
RUN python3 -m pip install git+https://github.com/huggingface/optimum.git@b3ecb6c405b7fd5425d79483fd7dc88c0609be8e
RUN python3 convert.py --quantize --model_id phi-1_5_dev --task "text-generation-with-past"
This last step fails with:
Traceback (most recent call last):
File "/code/transformers.js/scripts/convert.py", line 545, in <module>
main()
File "/code/transformers.js/scripts/convert.py", line 448, in main
main_export(**export_kwargs)
File "/usr/local/lib/python3.9/site-packages/optimum/exporters/onnx/__main__.py", line 280, in main_export
model = TasksManager.get_model_from_task(
File "/usr/local/lib/python3.9/site-packages/optimum/exporters/tasks.py", line 1951, in get_model_from_task
model = model_class.from_pretrained(model_name_or_path, **kwargs)
File "/usr/local/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3644, in from_pretrained
model, loading_info = load_tf2_checkpoint_in_pytorch_model(
File "/usr/local/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 524, in load_tf2_checkpoint_in_pytorch_model
tf_model_class = getattr(transformers, tf_model_class_name)
File "/usr/local/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1503, in __getattr__
raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module transformers has no attribute TFPhiForCausalLM
I'll pick up the investigation thread later this week. Thanks for all the help and input so far!
@xenova - I have run into this using:
TinyLlama-1.1B-Chat-v1.0
Qwen1.5-0.5B-Chat
and those are just a few.
@thekevinscott - I am assuming you are using a version of Python lower than 3.8? If so, may I advise upgrading / using an upgraded environment for running Transformers (Python)?
It seems you are running into model-specific issues on top of general Transformers errors that should be solved via a Torch and Python upgrade (as far as I know).
Cheers
Edit: I see you are using 3.9. Try upgrading / purging both Torch and Transformers.
I've landed on a working implementation here:
https://github.com/thekevinscott/reproducing-phi-1-5-conversion
This appears to convert Phi 1.5 successfully from the original repository.
To summarize the issues I ran into along the way:
- The task to use is text-generation-with-past. I don't see this documented anywhere (other than this thread)?
- Some dependencies in requirements.txt are out of date, or maybe just waiting on a publish to PyPI?
- The output model's onnx files need to be manually renamed. I don't believe this is documented anywhere.
It would be amazing if the Hugging Face model cards contained some of this information on the necessary steps to reproduce model conversions - that way more people could help contribute new models for this awesome library.
Cheers to both of you for your help. I'll leave this issue open since it sounds like @MarketingPip also has some issues, but feel free to close since my original query is now solved.