huggingface/transformers

Moving in a folder & `push_to_hub` for a `trust_remote_code=True` model

lbourdois opened this issue · 6 comments

Feature request

Bonjour !

I'm opening an issue following a discussion with Lysandre on Slack.

My request is to be able to use `..` (parent-directory) relative imports in model repositories on the HF Hub, where currently only `.` (same-directory) imports work. On this point, I don't know whether this applies to all model repositories or only to custom ones, which require trust_remote_code=True to load.

A second request: when you have a custom model (i.e. one loadable via trust_remote_code=True) and you finetune it, the push_to_hub function should push all the files the model needs to work properly, not just config.json, configuration.py, model.safetensors, special_tokens_map.json, tokenizer.json, tokenizer_config.json and training_args.bin.

Motivation

The concrete case behind my requests.

We recently extended Flash Attention to T5.

So we had to develop a custom implementation, and to load our pre-trained models for finetuning we have to do:

from transformers import AutoModel
model = AutoModel.from_pretrained("CATIE-AQ/FAT5-base-UL2-fr", trust_remote_code=True)

For this to work, we need a modeling file.

In our code on GitHub (https://github.com/catie-aq/flashT5/blob/main/src/model/modeling_flash_t5.py), we import classes that we've put in a utils folder, for example (line 47) from ..utils.positional_encoding import ALiBiPositionalEncoding, RelativePositionalEncoding, RotaryPositionalEncoding.
On HF, this returned an error saying that there was a .. in the modeling_flash_t5.py code and that the classes therefore could not be retrieved. We had to move all the code contained in the utils folder to the root.

image

The line I used as an example above then becomes from .positional_encoding import ALiBiPositionalEncoding, RelativePositionalEncoding, RotaryPositionalEncoding and it works.

So being able to import classes from files in subfolders would be appreciated 😄

The second request is related to the fact that, once this model has been finetuned, I do a push_to_hub to save the weights.
This pushes the files config.json, configuration_flash_t5.py, model.safetensors, special_tokens_map.json, tokenizer.json, tokenizer_config.json and training_args.bin.
And when I then want to reload the model to do inference, it tells me that the 8 files circled in red in the image above + the modeling_flash_t5.py file are missing.

So every time I finetune, I have to do a second push where I add these 9 missing files so that my model can load properly.

Wouldn't it be possible for these files (which are detected during model loading) to be pushed directly with the first push_to_hub? 🤗

Your contribution

Let me know if there's any way I can help.

Thanks @lbourdois ! Pinging @Rocketknight1 on it as soon as some bandwidth frees up!

Hm, this is challenging! I'm not sure how the model could autodetect which files should be pushed. I guess it would need to

  1. Inspect the Python code being pushed
  2. Look for imports
  3. Detect which of those are local imports
  4. Add those files to the commit
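The four steps above could be sketched with the stdlib ast module. The helper below (find_local_imports is a hypothetical name, not part of transformers) only covers step 1 to 3: it parses a source file and reports its relative imports, which are the candidates for extra files to add to the commit.

```python
# Sketch: detect "local" (relative) imports in a Python source file,
# assuming local imports are exactly the `from .foo import ...` style
# statements, which parse as ast.ImportFrom nodes with level >= 1.
import ast

def find_local_imports(source: str) -> list[str]:
    """Return the dotted names of relative imports found in `source`."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level >= 1:
            # node.level counts the leading dots; node.module may be None
            # for statements like `from . import foo`.
            modules.append("." * node.level + (node.module or ""))
    return modules

example = (
    "from .positional_encoding import ALiBiPositionalEncoding\n"
    "from ..utils.attention import flash_attention\n"
    "import torch\n"
)
print(find_local_imports(example))  # ['.positional_encoding', '..utils.attention']
```

Absolute imports like `import torch` are left out on purpose: only relative imports can refer to files that live alongside the modeling code in the repository.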

Alternatively, we could just use a simpler heuristic: maybe push_to_hub for remote-code models should also push the other .py files in the directory?
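For comparison, the simpler heuristic amounts to something like the following sketch (extra_python_files is an illustrative name; it only lists candidate files and calls no Hub API):

```python
# Minimal sketch of the "push every .py file" heuristic: list the Python
# files in the model directory that would not already be in the commit.
from pathlib import Path

def extra_python_files(model_dir: str, already_pushed: set[str]) -> list[str]:
    """List .py files in model_dir that are not already part of the push."""
    return sorted(
        p.name for p in Path(model_dir).glob("*.py")
        if p.name not in already_pushed
    )
```

The danger discussed below is visible here: this would also sweep up unrelated scripts, tests, or scratch files that happen to sit in the same directory.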

That heuristic seems a little dangerous, though, and I think code inspection could be tricky too. What do you think?

Heuristics sound good, but I'm probably not experienced enough to identify the potential dangers this could create.

Wouldn't it be possible to store the names of all the files downloaded from HF when trust_remote_code=True, and then on push_to_hub push back exactly those same files, except for the weights and the config file, which will have been modified by finetuning?
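That bookkeeping idea could be sketched as follows, under the assumption that the loader records the downloaded file names somewhere; all names here (files_to_repush, REGENERATED_BY_FINETUNING) are illustrative, not actual transformers APIs:

```python
# Sketch: remember which files were fetched when the model was loaded with
# trust_remote_code=True, then re-push that same set minus the files that
# finetuning rewrites anyway (the weights and the config).
REGENERATED_BY_FINETUNING = {"config.json", "model.safetensors"}

def files_to_repush(downloaded_at_load: set[str]) -> set[str]:
    """Files that must accompany the finetuned weights on push_to_hub."""
    return downloaded_at_load - REGENERATED_BY_FINETUNING

loaded = {"config.json", "model.safetensors", "modeling_flash_t5.py",
          "positional_encoding.py", "tokenizer.json"}
print(sorted(files_to_repush(loaded)))
# ['modeling_flash_t5.py', 'positional_encoding.py', 'tokenizer.json']
```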

Some backstory: a lot of the dynamic module loading code was written by Sylvain Gugger, who has since left HF, and so no-one really owns it right now! I'm probably the closest thing we have to the maintainer of those files, but I don't know all of it.

I did some investigation, and the relevant code is here. This is the code that is used when loading a pretrained model with trust_remote_code=True - it figures out what is imported, and this is used to decide what other files need to be downloaded.

I think it should be possible to refactor all of this to support both of your requests, but I'll have to do some deep dives into the codebase to make it all work. Basically, we would first rework this code so it could handle imports from local subfolders, and secondly use the same scanning methods when we upload files with push_to_hub, to figure out what other files need to go with them. That way we could avoid heuristics. WDYT?

It's not as trivial as I thought, given that some knowledge has been lost.
What you propose seems to answer both requests.
Thank you in advance for investigating and for the digging still needed 🙏

Cool, I'll see if I can make it work! And don't stress about the extra work - we need to have someone take ownership of it again, so this is as good an opportunity as any for me to start figuring out the code there.