qdrant/fastembed

Downloaded a model from Hugging Face manually; how can I load it from a local path instead of downloading from HF?

visheshgitrepo opened this issue · 10 comments

I downloaded the model files (model.onnx, tokenizer.json, and vocab.txt) from Hugging Face.

Now I want to pass this local path so that fastembed does not download from HF. I am trying the code below, but it is not working: it still tries to connect to the internet.

cache_path = "./embedding_model_s3"
This cache_path contains the 3 files mentioned above.
TextEmbedding(local_files_only=True, model_name="sentence-transformers/all-MiniLM-L6-v2", cache_dir=cache_path)

Is this the right way to do it?

Hi @visheshgitrepo, what exactly is the issue?

It checks the revision of the model over the internet. If that revision exists in the local dir, it reads the model from there; otherwise it downloads it from the HF hub.

Btw, you're passing local_files_only=True, but it is not supported yet.

I'm using fastembed in my application with Guardrails.
There I pass the engine and model as "fastembed" and "all-MiniLM-L6-v2", which calls fastembed with this model. By default it downloads the model from Hugging Face, and I don't want that. I just want to pass a model path that contains the model files and use it for embeddings. Something like below:

cache_path = "./sentence-transformers/all-MiniLM-L6-v2"
embedding_model = TextEmbedding(model_name=cache_path)

How did you download your model?

When fastembed downloads a model, it saves it under cache_dir with the following dir structure:

<cache_dir>/models--<repo>--<model-name>/

e.g. if you're using

TextEmbedding(model_name='sentence-transformers/all-MiniLM-L6-v2', cache_dir='./model_cache')

the full path to the model will be

./model_cache/models--qdrant--all-MiniLM-L6-v2-onnx/
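
As a rough illustration of the naming pattern above, the sketch below derives the expected folder from a Hugging Face repo id. The helper name expected_model_dir is hypothetical and not part of fastembed; it only mirrors the models--<repo>--<model-name> convention, and it assumes the repo actually downloaded for this model is qdrant/all-MiniLM-L6-v2-onnx, as the path above suggests.

```python
from pathlib import Path


def expected_model_dir(cache_dir: str, hf_repo_id: str) -> Path:
    """Return the folder a hub-style cache would use for `hf_repo_id`.

    Hypothetical helper: it only applies the `models--<org>--<name>` naming
    shown in the thread and does not call fastembed or the network.
    """
    # "qdrant/all-MiniLM-L6-v2-onnx" -> "models--qdrant--all-MiniLM-L6-v2-onnx"
    folder_name = "models--" + hf_repo_id.replace("/", "--")
    return Path(cache_dir) / folder_name


# Assumed repo id, taken from the example path above.
print(expected_model_dir("./model_cache", "qdrant/all-MiniLM-L6-v2-onnx"))
```

Comparing this path with what is actually on disk is a quick way to see whether manually copied files sit where a cache lookup would expect them.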

Hi @joein ,

I am facing the same issue,

My system setup: Windows Server on the client side, with access to Hugging Face restricted.

I am trying to use hybrid search from fastembed, but because the client side restricts downloading models directly, I need to download the models (Splade and AllMinilm) and ship them from my local machine.
Even if I provide a custom path as cache_dir, it still fails to load the model, apparently because it tries to check the latest version of the model online every time.

To work around this on the client network, I changed the default to True at the line below, in the installed copy of the code in my Python environment's lib folder (where qdrant client was installed), and it started to work.

local_files_only=kwargs.get("local_files_only", False) - 120
https://github.com/qdrant/fastembed/blob/main/fastembed/common/model_management.py

Please give us a way to load the embedding model directly from a local folder path, as many users will be restricted from downloading the model every time.

Hi @jai8004

The local_files_only option is available as of fastembed==0.2.7.

It seems that you're already using local_files_only, thus you're on fastembed 0.2.7.

However, you don't need to change the default value in the library code; you just need to initialize your embedding with local_files_only passed as a keyword argument, e.g. TextEmbedding(local_files_only=True)
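
A minimal sketch of that initialization, assuming fastembed>=0.2.7 is installed and that cache_dir already holds the model in the layout fastembed itself produces. The helper names offline_kwargs and load_local_embedding are made up for illustration; only TextEmbedding and its model_name, cache_dir, and local_files_only arguments come from this thread.

```python
def offline_kwargs(model_name: str, cache_dir: str) -> dict:
    """Collect the keyword arguments discussed in this thread (hypothetical helper)."""
    return {
        "model_name": model_name,
        "cache_dir": cache_dir,
        # Ask the loader to use only files already present in cache_dir.
        "local_files_only": True,
    }


def load_local_embedding(model_name: str, cache_dir: str):
    """Create a TextEmbedding without editing the library's default value."""
    # Imported lazily so this module can be imported without fastembed installed.
    from fastembed import TextEmbedding

    return TextEmbedding(**offline_kwargs(model_name, cache_dir))
```

With the model already cached, load_local_embedding("sentence-transformers/all-MiniLM-L6-v2", "./model_cache") would return the embedding object, and list(model.embed(["hello world"])) would produce the vectors.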

Closing this as it has already been implemented. Feel free to start a discussion or create a new issue if something does not work.

mark

@visheshgitrepo, how did you solve your issue? I am facing the same issue while using Guardrails.

This is not working for me. It still attempts to download from Hugging Face, even with local_files_only=True and cache_dir set to where the model files are. We have to approve models first, so direct HF downloading is not allowed. Why does it still reach out?

I think it's because the directory is not in the HF cache format but contains the model files themselves. Is there no way to load them from a path?
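
One way to test that guess is to check whether the directory looks like a Hugging Face hub cache, which keeps files under models--<org>--<name>/snapshots/<revision>/ rather than flat in the folder. The function name looks_like_hub_cache is hypothetical, and whether fastembed requires exactly this layout is an assumption based on the earlier comments in this thread, not something confirmed here.

```python
from pathlib import Path


def looks_like_hub_cache(cache_dir: str) -> bool:
    """Return True if cache_dir has a models--*/snapshots/<rev>/ folder with entries.

    Hypothetical diagnostic: it only inspects the directory layout and never
    touches the network or loads a model.
    """
    for model_dir in Path(cache_dir).glob("models--*"):
        snapshots = model_dir / "snapshots"
        if not snapshots.is_dir():
            continue
        for revision in snapshots.iterdir():
            # A usable snapshot is a revision folder containing at least one entry.
            if revision.is_dir() and any(revision.iterdir()):
                return True
    return False
```

A folder that holds only model.onnx, tokenizer.json, and vocab.txt at its top level would return False here, which would be consistent with the behaviour described above.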