Assertion Error when running the server and client locally
Hello! I am currently using vectordb in a personal project and wanted to get the server and client running locally. I started both the server and the client as per the instructions in the README. The code I use is below:
# server.py
from docarray import DocList
import numpy as np
from vectordb import InMemoryExactNNVectorDB, HNSWVectorDB
from docarray import BaseDoc
from docarray.typing import NdArray
class ToyDoc(BaseDoc):
    text: str = ''
    embedding: NdArray[128]
# Specify your workspace path
db = InMemoryExactNNVectorDB[ToyDoc](workspace='./workspace_path')
# Index a list of documents with random embeddings
doc_list = [ToyDoc(text=f'toy doc {i}', embedding=np.random.rand(128)) for i in range(1000)]
db.index(inputs=DocList[ToyDoc](doc_list))
with db.serve(protocol='grpc', port=12345, replicas=1, shards=1) as service:
    service.block()
# client.py
import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from vectordb import Client

class ToyDoc(BaseDoc):
    text: str = ''
    embedding: NdArray[128]

# Instantiate a client connected to the server. In practice, replace 0.0.0.0 with the server IP address.
client = Client[ToyDoc](address='grpc://0.0.0.0:12345')
# Build a query document (not shown in the original snippet; a random embedding is assumed here)
query = ToyDoc(text='query', embedding=np.random.rand(128))
# Perform a search query
results = client.search(inputs=DocList[ToyDoc]([query]), limit=10)
However, when I run the server and the client, I get an AssertionError from the server:
# Last line of server output
assert len(docs) == len(matched_documents) == len(matched_scores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Here is the full error from the server
ERROR indexer/rep-0@28106 AssertionError() [08/06/24 13:52:39]
add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/runtimes/worker/request_handling.py", line 1106,
in process_data
result = await self.handle(
^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/runtimes/worker/request_handling.py", line 720,
in handle
return_data = await self._executor.__acall__(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 749, in __acall__
return await self.__acall_endpoint__(req_endpoint, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 881, in
__acall_endpoint__
return await exec_func(
^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 839, in exec_func
return await get_or_reuse_loop().run_in_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/jina/serve/executors/decorators.py", line 325, in
arg_wrapper
return fn(executor_instance, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/vectordb/db/executors/inmemory_exact_indexer.py", line 54,
in search
return self._search(docs, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/woc/lib/python3.11/site-packages/vectordb/db/executors/inmemory_exact_indexer.py", line 42,
in _search
assert len(docs) == len(matched_documents) == len(matched_scores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Is there something I am missing? Or is this a known issue?
Is there anything indexed?
Things should be indexed. From my understanding, that's what db.index does in server.py?
Is there any documentation for this library outside of the README in the GitHub repo?
If you did not index any documents, there will be none.
There is no other documentation except for this repo.
But I did index some documents. In the server.py file I have this line
db.index(inputs=DocList[ToyDoc](doc_list))
To the best of my knowledge this is the indexing operation?
OK, I missed that. I will check there. Can you share the list of versions for vectordb, jina and docarray and their dependencies?
vectordb==0.0.21
docarray==0.40.0
numpy==1.26.1
orjson==3.10.6
pydantic==1.10.17
rich==13.7.1
types-requests==2.31.0.6
typing-inspect==0.9.0
jina==3.27.2
aiofiles==24.1.0
aiohttp==3.10.1
docarray==0.40.0
docker==7.1.0
fastapi==0.112.0
filelock==3.15.4
grpcio==1.57.0
grpcio-health-checking==1.57.0
grpcio-reflection==1.57.0
jcloud==0.3
jina-hubble-sdk==0.39.0
numpy==1.26.1
opentelemetry-api==1.19.0
opentelemetry-exporter-otlp==1.19.0
opentelemetry-exporter-otlp-proto-grpc==1.19.0
opentelemetry-exporter-prometheus==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.40b0
opentelemetry-instrumentation-fastapi==0.40b0
opentelemetry-instrumentation-grpc==0.40b0
opentelemetry-sdk==1.19.0
packaging==24.1
pathspec==0.12.1
prometheus_client==0.20.0
protobuf==4.25.4
pydantic==1.10.17
python-multipart==0.0.9
PyYAML==6.0.1
requests==2.32.3
urllib3==1.26.19
uvicorn==0.23.1
uvloop==0.19.0
websockets==12.0
Above are all the dependencies for vectordb, jina and docarray. The main versions are vectordb==0.0.21, docarray==0.40.0 and jina==3.27.2.
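For anyone reporting a similar problem, the three main versions can also be collected with the standard library. This is only a small sketch: the helper name collect_versions is made up for illustration, and it reports whatever is installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("vectordb", "jina", "docarray")):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not present in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Run it inside the same environment as the server so that the reported versions match what actually raised the error.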
What works is to index after the server has started, so move the indexing code inside the serve block:
#db.index(inputs=DocList[ToyDoc](doc_list))
with db.serve(protocol='grpc', port=12345, replicas=1, shards=1) as service:
    service.index(inputs=DocList[ToyDoc](doc_list))
    service.block()
I see it may not be so clear in the docs, but the DB behaves slightly differently when it is started as a service than when it is used as a plain Python object, so you have to stick to one mode consistently.
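To make the difference concrete, here is a toy model in plain Python. It is not vectordb code: ToyIndexer and ToyDB are hypothetical, and two things are assumptions rather than facts confirmed in this thread: that the served copy starts with its own empty index, and that searching an empty index returns no per-query result lists, which is what would trip an assertion of the shape seen in the traceback.

```python
class ToyIndexer:
    """Hypothetical stand-in for an indexer; not vectordb code."""

    def __init__(self):
        self.items = []

    def index(self, docs):
        self.items.extend(docs)

    def search(self, queries, limit=10):
        # Assumption: an empty index yields no per-query result lists at all.
        if self.items:
            matched = [self.items[:limit] for _ in queries]
        else:
            matched = []
        scores = [[1.0] * len(m) for m in matched]
        # Same shape of check as the failing line in the traceback.
        assert len(queries) == len(matched) == len(scores)
        return matched


class ToyDB:
    """Hypothetical: the local object and the served copy keep separate indexes."""

    def __init__(self):
        self.local = ToyIndexer()

    def index(self, docs):
        self.local.index(docs)

    def serve(self):
        # Assumption: serving starts a fresh indexer that does not see self.local.
        return ToyIndexer()


db = ToyDB()
db.index(["doc-1", "doc-2"])   # indexed on the local object only
service = db.serve()

try:
    service.search(["query"])  # the served copy has nothing indexed
    failed = False
except AssertionError:
    failed = True

service.index(["doc-1", "doc-2"])      # index through the service instead
results = service.search(["query"])    # now one result list per query
```

In this model the first search fails because one query comes in but zero result lists come back; indexing through the service, as in the fix above, makes the lengths agree.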