Ollama now supports embedding models
kasperwelbers opened this issue · 6 comments
Ollama added support for embedding models like BERT. This is much faster than using a generative model, such as llama2, which is currently the default in embed_text.
Changing this default, and perhaps adding documentation to help people pick good embedding models, could make rollama super useful for all sorts of downstream tasks in R!
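For anyone curious what the request looks like underneath, here is a minimal Python sketch that asks a local Ollama server for an embedding from a dedicated embedding model. It talks to the HTTP API directly rather than going through rollama. It assumes the server runs at the default http://localhost:11434, that the /api/embeddings endpoint is available in your Ollama version, and that nomic-embed-text has already been pulled. The helper names build_embedding_request and embed are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local server address


def build_embedding_request(text, model="nomic-embed-text"):
    """Return the JSON body for a single embedding request."""
    return {"model": model, "prompt": text}


def embed(text, model="nomic-embed-text", base_url=OLLAMA_URL):
    """Send one text to the server and return its embedding as a list of floats."""
    body = json.dumps(build_embedding_request(text, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # Requires a running Ollama server with the model available.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]
```

Swapping the model argument for llama2 should be enough to compare the two on your own texts.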
Nice! It didn't work until I updated to v0.1.29 (0.1.26 is apparently the minimum). But then nomic-embed-text was about 4 times faster than the default llama2 model in the embedding vignette example (and the F-measure of the resulting model was 0.05 better 😉).
I'm thinking about the best approach for this. Having one default throughout the package is neat, but models meant for embedding are definitely faster and make more sense for a lot of people. I will at least add it to the vignette and the examples.
It would also be good to add how you can use arbitrary embedding models from huggingface. Not sure if the process is the same for these models as what is documented here: https://github.com/ollama/ollama/blob/main/docs/import.md
[Post removed]
This only worked because I grabbed the wrong modelfile. It's actually more complicated...
Nice, that's really cool!
What is the purpose of Python here? Is it only there to download the model? Because then it might also be done with hfhub, which seems to be an effort by the Posit team to bring huggingface to R.
That's exactly what I was looking for! For some reason it didn't show up in my searches and I assumed I had dreamed it 😅. Yes, the Python stuff was just for downloading the file. Now all we need is a good heuristic to identify the file Ollama wants.
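A very rough Python sketch of what such a heuristic could look like: given the file names in a downloaded repository, pick one that already looks like a GGUF file. This rests on the assumption that a .gguf file can be referenced directly when importing into Ollama, and that anything else needs converting first. The function name and the example file names are hypothetical.

```python
def pick_gguf_file(filenames):
    """Return the first file name (alphabetically) ending in .gguf, else None."""
    for name in sorted(filenames):
        if name.lower().endswith(".gguf"):
            return name
    return None  # nothing importable as-is; the model would need converting first


# Hypothetical repository listing without a GGUF file.
repo_files = ["config.json", "model.safetensors", "tokenizer.json"]
print(pick_gguf_file(repo_files))
```

A result of None is a hint that a conversion step is needed before Ollama can use the model.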
Ok, I was a bit quick with the post above and couldn't reproduce it with the files downloaded through hfhub. Finally, I noticed I had accidentally grabbed the wrong model file.
You do indeed need to first follow the steps to convert the model using convert-hf-to-gguf.py, and then move the converted bin file to a directory Ollama has access to (in my case inside the container).
So for now, I would tell people to rely on either nomic-embed-text or all-minilm and check what might be added in the future.
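For those who want to try the import route anyway, the last step after converting could be sketched in Python as below: write a minimal Modelfile that points at the converted file. This assumes the FROM instruction with a local path, as described in the Ollama import docs linked earlier. The file name converted.bin and both helper names are hypothetical, and I have not checked that every converted embedding model actually loads this way.

```python
from pathlib import Path


def modelfile_text(weights_path):
    """Return a minimal Modelfile that points Ollama at a local weights file."""
    return f"FROM {weights_path}\n"


def write_modelfile(weights_path, directory="."):
    """Write the Modelfile into `directory` and return its path."""
    target = Path(directory) / "Modelfile"
    target.write_text(modelfile_text(weights_path), encoding="utf-8")
    return target


# Hypothetical file name for the converted model.
print(modelfile_text("./converted.bin"), end="")
```

With the Modelfile in place, something like `ollama create my-embedder -f Modelfile` should register the model, where my-embedder is just a placeholder name.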