kermitt2/delft

[suggestion] add transformer name / embedding names in the model name

Closed this issue · 3 comments

I think it could be useful to include additional information in the model name, such as the transformer name or the embeddings name. We could implement a form of name compression (e.g. removing the slashes in the Hugging Face names to make them more readable).
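To make the name-compression idea concrete, here is a minimal sketch. The helper name `compress_model_name`, the `_` separator, and the choice to replace slashes with `-` (rather than drop them outright) are all assumptions for illustration, not anything that exists in DeLFT.

```python
import re
from typing import List


def compress_model_name(base_name: str, components: List[str]) -> str:
    """Hypothetical helper: append filesystem-safe component names to a base name."""
    parts = [base_name]
    for component in components:
        # Replace slashes (as in Hugging Face identifiers) and any other run of
        # characters outside [A-Za-z0-9._-] with a single hyphen.
        safe = re.sub(r"[^A-Za-z0-9._-]+", "-", component).strip("-")
        parts.append(safe)
    # Join the base name and the sanitized components with underscores.
    return "_".join(parts)


if __name__ == "__main__":
    print(compress_model_name("ner", ["allenai/scibert_scivocab_cased"]))
```

With several embeddings or transformers in the list, the resulting name grows with each component, so it can become long quickly.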

The motivation would be to allow models trained with different transformers/embeddings to coexist.

What do you think?

That was my first idea: I added the embeddings name to the model name very early on, then removed it.

The problem is that we can have several embeddings, and in the future several transformers, in the same architecture. They can all be combined, so the name would become unreadable, and it would be hard to keep names canonical with so many pieces.
Daniel also raised the point that there was too much "magic" in the way the path to a model was resolved (using its name), and that we should avoid using the name to "guess" attributes of the model - which is a very good point. There are still a few places where the name is used this way (I am trying to remove them).
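As an illustration of the "no guessing from the name" point, here is a minimal sketch in which the model attributes are written explicitly to a file next to the model and read back from there, so the name stays a free-form label. The function names, the `config.json` file name, and the keys are hypothetical and do not describe DeLFT's actual layout.

```python
import json
from pathlib import Path


def save_model_config(model_dir, config):
    """Hypothetical: write the model attributes explicitly into the model directory."""
    path = Path(model_dir)
    path.mkdir(parents=True, exist_ok=True)
    with open(path / "config.json", "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)


def load_model_config(model_dir):
    """Hypothetical: read the attributes back without parsing the directory name."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # The directory name is arbitrary; the attributes come from the file.
    save_model_config("my-arbitrary-name", {
        "architecture": "BidLSTM_CRF",
        "embeddings": ["glove-840B"],
        "transformers": [],
    })
    print(load_model_config("my-arbitrary-name")["architecture"])
```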

On the other hand, the library supports arbitrary names. Naming is only managed in the applications/ part, so nothing prevents a user of the library from changing it.

I see. We can drop this then; a user can specify the name and follow their own convention. We can assume that suffixing the architecture is the only "trick" applied to the name.

Suffixing the architecture name is a convention of delft/applications/, which is one usage of the core library. In the core library itself, I think the idea should be to have zero tricks: no more magic.