Migrate from `rustformers/llm` to `Candle`
Issue closed · 1 comment
Spin relies on rustformers/llm. That library has been archived and is no longer actively maintained. Using it has also constrained us to models in the ggml format, which is no longer popular; most models are not distributed in that format.
I suggest that we migrate to candle, which will allow us to use models in the much more popular safetensors format. This migration will also unlock the ability to use newer models like Llama 3(.1).
The current implementation supports a few models, each listed as a feature. I think we can begin by supporting only Llama models for inference. We can then optionally add support for other models as desired and feature-gate them to keep the binary size in check.
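To make the feature-gating idea concrete, here is a rough sketch of how it could look. Everything in it is hypothetical: the `ModelArch` enum, the `from_name` helper, and the `mistral` Cargo feature are illustrative names, not Spin's or Candle's actual API. It assumes Llama is always compiled in and every other architecture sits behind an opt-in feature declared in `Cargo.toml`.

```rust
// Hypothetical sketch: the type, function, and feature names are
// illustrative only and do not reflect Spin's real code.

/// Model architectures this build of the inference engine can run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelArch {
    /// Always compiled in.
    Llama,
    /// Compiled in only when the (hypothetical) `mistral` feature is enabled.
    #[cfg(feature = "mistral")]
    Mistral,
}

impl ModelArch {
    /// Resolve a user-supplied model name to an architecture,
    /// returning `None` if this build was compiled without it.
    pub fn from_name(name: &str) -> Option<Self> {
        match name.to_ascii_lowercase().as_str() {
            "llama" => Some(ModelArch::Llama),
            #[cfg(feature = "mistral")]
            "mistral" => Some(ModelArch::Mistral),
            _ => None,
        }
    }
}

fn main() {
    for name in ["llama", "mistral"] {
        match ModelArch::from_name(name) {
            Some(arch) => println!("{name}: supported ({arch:?})"),
            None => println!("{name}: not compiled into this build"),
        }
    }
}
```

With this shape, a default build carries only the Llama code path, and code for any other architecture is left out of the binary unless its feature is turned on.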
I would love to hear your thoughts!
I like the idea of starting with Llama and feature-gating others as people express interest.