qdrant/fastembed

[Error]: ValueError: Could not download model prithvida/Splade_PP_en_v1 from any source.

syedzaidi-kiwi opened this issue · 3 comments

What happened?

Token will not been saved to git credential helper. Pass add_to_git_credential=True if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful
2024-06-03 14:17:44.104 | ERROR | fastembed.common.model_management:download_model:236 - Could not download model from HuggingFace: 401 Client Error. (Request ID: Root=1-665dd088-700f65d523b7a4746601336c;19ab477d-0eaf-438d-b57f-4afa629b9a37)

Repository Not Found for url: https://huggingface.co/api/models/Qdrant/SPLADE_PP_en_v1/revision/main.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
User Access Token "HybridSearch" is expiredFalling back to other sources.

ValueError Traceback (most recent call last)
in <cell line: 50>()
48 sparse_model_name = "prithvida/Splade_PP_en_v1"
49 dense_model_name = "mixedbread-ai/mxbai-embed-large-v1"
---> 50 sparse_model = SparseTextEmbedding(model_name=sparse_model_name, batch_size=32)
51 dense_model = TextEmbedding(model_name=dense_model_name, batch_size=32)
52

2 frames
/usr/local/lib/python3.10/dist-packages/fastembed/common/model_management.py in download_model(cls, model, cache_dir, **kwargs)
242 return cls.retrieve_model_gcs(model["model"], url_source, str(cache_dir))
243
--> 244 raise ValueError(f"Could not download model {model['model']} from any source.")

ValueError: Could not download model prithvida/Splade_PP_en_v1 from any source.

I have checked the HF_TOKEN and everything is fine. This code was working for me, but it has suddenly started giving errors.
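For reference, here is a minimal sketch of one way to verify the cached token programmatically, using the whoami and logout helpers from huggingface_hub:

from huggingface_hub import whoami, logout

try:
    whoami()  # raises if the cached token is invalid or expired
    print("Cached HF token is valid.")
except Exception:
    logout()  # removes the stale token from ~/.cache/huggingface/token
    print("Removed stale token; fastembed can fetch public models anonymously.")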

What Python version are you on? e.g. python --version

Python 3.11.7

Version

0.2.7 (Latest)

What os are you seeing the problem on?

macOS

Relevant stack traces and/or logs

No response

Full code below

import os
from typing import List, Tuple  # needed for the List/Tuple annotations below

import numpy as np
from fastembed import SparseEmbedding, SparseTextEmbedding, TextEmbedding
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance,
    NamedSparseVector,
    NamedVector,
    SparseVector,
    PointStruct,
    SearchRequest,
    SparseIndexParams,
    SparseVectorParams,
    VectorParams,
    ScoredPoint,
)
from transformers import AutoTokenizer
from huggingface_hub import login


# Login to Hugging Face
login(token="XXX")

# Initialize Qdrant client
qdrant_client = QdrantClient(
    url="XXX",
    api_key="XXX",
)

# Models
sparse_model_name = "prithvida/Splade_PP_en_v1"
dense_model_name = "mixedbread-ai/mxbai-embed-large-v1"
sparse_model = SparseTextEmbedding(model_name=sparse_model_name, batch_size=32)
dense_model = TextEmbedding(model_name=dense_model_name, batch_size=32)

# Hybrid search function
def search(query_text: str, top_k: int = 10):
    query_sparse_vectors: List[SparseEmbedding] = list(sparse_model.embed([query_text]))
    query_dense_vector: List[np.ndarray] = list(dense_model.embed([query_text]))

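    # One dense and one sparse request are sent in a single batch;
    # the responses come back in the same order as the requests.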
    search_results = qdrant_client.search_batch(
        collection_name="XXX",
        requests=[
            SearchRequest(
                vector=NamedVector(
                    name="text-dense",
                    vector=query_dense_vector[0],
                ),
                limit=top_k,
                with_payload=True,
            ),
            SearchRequest(
                vector=NamedSparseVector(
                    name="text-sparse",
                    vector=SparseVector(
                        indices=query_sparse_vectors[0].indices.tolist(),
                        values=query_sparse_vectors[0].values.tolist(),
                    ),
                ),
                limit=top_k,
                with_payload=True,
            ),
        ],
    )

    return search_results

# Example usage
query = "What is the projected increase in U.S. wheat supplies for the 2024/25 period compared to 2023/24?"
results = search(query, top_k=1)

def rank_list(search_result: List[ScoredPoint]):
    return [(point.id, rank + 1) for rank, point in enumerate(search_result)]

def rrf(rank_lists, alpha=60, default_rank=1000):
    all_items = set(item for rank_list in rank_lists for item, _ in rank_list)
    item_to_index = {item: idx for idx, item in enumerate(all_items)}
    rank_matrix = np.full((len(all_items), len(rank_lists)), default_rank)
    for list_idx, rank_list in enumerate(rank_lists):
        for item, rank in rank_list:
            rank_matrix[item_to_index[item], list_idx] = rank
    rrf_scores = np.sum(1.0 / (alpha + rank_matrix), axis=1)
    sorted_indices = np.argsort(-rrf_scores)
    sorted_items = [(list(item_to_index.keys())[idx], rrf_scores[idx]) for idx in sorted_indices]
    return sorted_items
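# rrf() above is Reciprocal Rank Fusion: each item is scored as
# sum(1 / (alpha + rank_i)) across the rank lists, and default_rank
# penalizes items that are missing from a list. Toy example
# (hypothetical values, not from this issue):
#   rrf([[("a", 1), ("b", 2)], [("b", 1), ("c", 2)]])
# "b" scores 1/62 + 1/61 and beats "a" (1/61 + 1/1060) because it
# appears near the top of both lists.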

dense_rank_list, sparse_rank_list = rank_list(results[0]), rank_list(results[1])
rrf_rank_list = rrf([dense_rank_list, sparse_rank_list])

def find_point_by_id(client: QdrantClient, collection_name: str, rrf_rank_list: List[Tuple[int, float]]):
    return client.retrieve(collection_name=collection_name, ids=[item[0] for item in rrf_rank_list])

retrieved_points = find_point_by_id(qdrant_client, "XXX", rrf_rank_list)

# Print the top results
for point in retrieved_points:
    print(f"Document ID: {point.id}, Text: {point.payload['text'][:2000]}")

Hey @syedzaidi-kiwi. I tried to reproduce using your code snippet. It worked fine for me. Could you please try again?

Also, you don't need to specify an HF token.
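For anyone hitting this later: fastembed pulls these weights from public repos, so no login() call is needed. A minimal sketch to sanity-check the model id against fastembed's catalog (assuming list_supported_models() is available in your fastembed version):

from fastembed import SparseTextEmbedding

# Public model: no HF token or login() is required for the download.
print([m["model"] for m in SparseTextEmbedding.list_supported_models()])
sparse_model = SparseTextEmbedding(model_name="prithvida/Splade_PP_en_v1")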

Yes, it's working fine for me now. Thank you for your reply.