langflow-ai/langflow

ValueError: Error performing search in AstraDBVectorStore: 'content'

Bug Description

Hello,

When trying to set up the RAG template with AstraDB, we get the following error:

ValueError: Error performing search in AstraDBVectorStore: 'content'

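It looks like the code is trying to read a key that doesn't exist: the traceback ends with KeyError: 'content', raised in decode in langchain_astradb\utils\encoders.py, where it reads astra_document["content"].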
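To illustrate what seems to be happening, here is a minimal sketch in plain Python with no Langflow or langchain_astradb imports. The decode function only mirrors the two dictionary lookups visible in the traceback. The search_documents wrapper, the sample hit, and its "text" field are assumptions made up for illustration, not the real component code or my real data.

```python
from typing import Any


def decode(astra_document: dict[str, Any]) -> dict[str, Any]:
    # Mirrors the lookups shown in the traceback (encoders.py, decode):
    # both keys are read with plain indexing, so a missing key raises KeyError.
    return {
        "page_content": astra_document["content"],
        "metadata": astra_document["metadata"],
    }


def search_documents(hits: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Hypothetical stand-in for the component's search method. Assumption:
    # the original exception is wrapped in a ValueError with "raise ... from",
    # which would explain the "direct cause" line in the traceback.
    try:
        return [decode(hit) for hit in hits]
    except KeyError as e:
        msg = f"Error performing search in AstraDBVectorStore: {e}"
        raise ValueError(msg) from e


# Hypothetical hit whose text lives under "text" instead of "content".
hits = [{"_id": "1", "text": "hello", "metadata": {}}]

try:
    search_documents(hits)
except ValueError as err:
    error_message = str(err)
    cause = err.__cause__
    print(error_message)
    print(type(cause).__name__)
```

If this is right, the 'content' at the end of the error message is just the string form of the KeyError, so the documents returned by the collection probably don't have a "content" field at all.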

Traceback (most recent call last):

                               File "<frozen runpy>", line 198, in _run_module_as_main
                               File "<frozen runpy>", line 88, in _run_code

                               File "C:\Python311\Scripts\langflow.exe\__main__.py", line 7, in <module>

                               File "C:\Python311\Lib\site-packages\langflow\__main__.py", line 528, in main
                                 app()
                                 -> <typer.main.Typer object at 0x000002684A448DD0>
                               File "C:\Python311\Lib\site-packages\typer\main.py", line 321, in __call__
                                 return get_command(self)(*args, **kwargs)
                                        |           |      |       -> {}
                                        |           |      -> ()
                                        |           -> <typer.main.Typer object at 0x000002684A448DD0>
                                        -> <function get_command at 0x000002682A6B8680>
                               File "C:\Python311\Lib\site-packages\click\core.py", line 1157, in __call__
                                 return self.main(*args, **kwargs)
                                        |    |     |       -> {}
                                        |    |     -> ()
                                        |    -> <function TyperGroup.main at 0x000002682A69E480>
                                        -> <TyperGroup >
                               File "C:\Python311\Lib\site-packages\typer\core.py", line 728, in main
                                 return _main(
                                        -> <function _main at 0x000002682A69D620>
                               File "C:\Python311\Lib\site-packages\typer\core.py", line 197, in _main
                                 rv = self.invoke(ctx)
                                      |    |      -> <click.core.Context object at 0x000002684A408F90>
                                      |    -> <function MultiCommand.invoke at 0x00000268270756C0>
                                      -> <TyperGroup >
                               File "C:\Python311\Lib\site-packages\click\core.py", line 1688, in invoke
                                 return _process_result(sub_ctx.command.invoke(sub_ctx))
                                        |               |       |       |      -> <click.core.Context object
                             at 0x000002684A965F50>
                                        |               |       |       -> <function Command.invoke at
                             0x0000026827075080>
                                        |               |       -> <TyperCommand run>
                                        |               -> <click.core.Context object at 0x000002684A965F50>
                                        -> <function MultiCommand.invoke.<locals>._process_result at
                             0x000002684ABD6160>
                               File "C:\Python311\Lib\site-packages\click\core.py", line 1434, in invoke
                                 return ctx.invoke(self.callback, **ctx.params)
                                        |   |      |    |           |   -> {'host': '127.0.0.1', 'workers':
                             1, 'timeout': 300, 'port': 7860, 'components_path':
                             WindowsPath('C:/Python311/Lib/site-packa...
                                        |   |      |    |           -> <click.core.Context object at
                             0x000002684A965F50>
                                        |   |      |    -> <function run at 0x000002684ABD5E40>
                                        |   |      -> <TyperCommand run>
                                        |   -> <function Context.invoke at 0x00000268270679C0>
                                        -> <click.core.Context object at 0x000002684A965F50>
                               File "C:\Python311\Lib\site-packages\click\core.py", line 783, in invoke
                                 return __callback(*args, **kwargs)
                                                    |       -> {'host': '127.0.0.1', 'workers': 1,
                             'timeout': 300, 'port': 7860, 'components_path':
                             WindowsPath('C:/Python311/Lib/site-packa...
                                                    -> ()
                               File "C:\Python311\Lib\site-packages\typer\main.py", line 703, in wrapper
                                 return callback(**use_params)
                                        |          -> {'host': '127.0.0.1', 'workers': 1, 'timeout': 300,
                             'port': 7860, 'components_path': WindowsPath('C:/Python311/Lib/site-packa...
                                        -> <function run at 0x000002684ABD5300>
                               File "C:\Python311\Lib\site-packages\langflow\__main__.py", line 189, in run
                                 process = run_on_windows(host, port, log_level, options, app)
                                           |              |     |     |          |        ->
                             <fastapi.applications.FastAPI object at 0x000002682A45CF90>
                                           |              |     |     |          -> {'bind':
                             '127.0.0.1:7860', 'workers': 1, 'timeout': 300}
                                           |              |     |     -> 'debug'
                                           |              |     -> 7860
                                           |              -> '127.0.0.1'
                                           -> <function run_on_windows at 0x000002684ABD5440>
                               File "C:\Python311\Lib\site-packages\langflow\__main__.py", line 232, in
                             run_on_windows
                                 run_langflow(host, port, log_level, options, app)
                                 |            |     |     |          |        ->
                             <fastapi.applications.FastAPI object at 0x000002682A45CF90>
                                 |            |     |     |          -> {'bind': '127.0.0.1:7860',
                             'workers': 1, 'timeout': 300}
                                 |            |     |     -> 'debug'
                                 |            |     -> 7860
                                 |            -> '127.0.0.1'
                                 -> <function run_langflow at 0x000002684ABD5940>
                               File "C:\Python311\Lib\site-packages\langflow\__main__.py", line 354, in
                             run_langflow
                                 uvicorn.run(
                                 |       -> <function run at 0x000002684B677420>
                                 -> <module 'uvicorn' from
                             'C:\\Python311\\Lib\\site-packages\\uvicorn\\__init__.py'>
                               File "C:\Python311\Lib\site-packages\uvicorn\main.py", line 577, in run
                                 server.run()
                                 |      -> <function Server.run at 0x000002684B6777E0>
                                 -> <uvicorn.server.Server object at 0x000002684B5B7290>
                               File "C:\Python311\Lib\site-packages\uvicorn\server.py", line 65, in run
                                 return asyncio.run(self.serve(sockets=sockets))
                                        |       |   |    |             -> None
                                        |       |   |    -> <function Server.serve at 0x000002684B677880>
                                        |       |   -> <uvicorn.server.Server object at 0x000002684B5B7290>
                                        |       -> <function _patch_asyncio.<locals>.run at
                             0x000002684C6E7560>
                                        -> <module 'asyncio' from
                             'C:\\Python311\\Lib\\asyncio\\__init__.py'>
                               File "C:\Python311\Lib\asyncio\runners.py", line 190, in run
                                 return runner.run(main)
                                        |      |   -> <coroutine object Server.serve at 0x000002684AB79B70>
                                        |      -> <function Runner.run at 0x000002682949AA20>
                                        -> <asyncio.runners.Runner object at 0x000002684AC14410>
                               File "C:\Python311\Lib\asyncio\runners.py", line 118, in run
                                 return self._loop.run_until_complete(task)
                                        |    |     |                  -> <Task pending name='Task-1'
                             coro=<Server.serve() running at
                             C:\Python311\Lib\site-packages\uvicorn\server.py:69> wait_for=<Fu...
                                        |    |     -> <function _patch_loop.<locals>.run_until_complete at
                             0x000002684C6E7920>
                                        |    -> <ProactorEventLoop running=True closed=False debug=False>
                                        -> <asyncio.runners.Runner object at 0x000002684AC14410>
                               File "C:\Python311\Lib\asyncio\base_events.py", line 637, in
                             run_until_complete
                                 self.run_forever()
                                 |    -> <function _patch_loop.<locals>.run_forever at 0x000002684C6E7880>
                                 -> <ProactorEventLoop running=True closed=False debug=False>
                               File "C:\Python311\Lib\asyncio\windows_events.py", line 321, in run_forever
                                 super().run_forever()
                               File "C:\Python311\Lib\asyncio\base_events.py", line 604, in run_forever
                                 self._run_once()
                                 |    -> <function _patch_loop.<locals>._run_once at 0x000002684C6E79C0>
                                 -> <ProactorEventLoop running=True closed=False debug=False>
                               File "C:\Python311\Lib\site-packages\nest_asyncio.py", line 133, in _run_once
                                 handle._run()
                                 |      -> <function Handle._run at 0x000002682940B7E0>
                                 -> <Handle Task.__wakeup(<Future finis...026801F806D0>>)>
                               File "C:\Python311\Lib\asyncio\events.py", line 80, in _run
                                 self._context.run(self._callback, *self._args)
                                 |    |            |    |           |    -> <member '_args' of 'Handle'
                             objects>
                                 |    |            |    |           -> <Handle Task.__wakeup(<Future
                             finis...026801F806D0>>)>
                                 |    |            |    -> <member '_callback' of 'Handle' objects>
                                 |    |            -> <Handle Task.__wakeup(<Future finis...026801F806D0>>)>
                                 |    -> <member '_context' of 'Handle' objects>
                                 -> <Handle Task.__wakeup(<Future finis...026801F806D0>>)>
                               File "C:\Python311\Lib\asyncio\tasks.py", line 350, in __wakeup
                                 self.__step()
                                 -> <Task pending name='Task-937' coro=<build_flow.<locals>._build_vertex()
                             running at C:\Python311\Lib\site-packages\langflow\ap...
                               File "C:\Python311\Lib\asyncio\tasks.py", line 267, in __step
                                 result = coro.send(None)
                                          |    -> <method 'send' of 'coroutine' objects>
                                          -> <coroutine object build_flow.<locals>._build_vertex at
                             0x0000026861F390A0>
                               File "C:\Python311\Lib\site-packages\langflow\api\v1\chat.py", line 219, in
                             _build_vertex
                                 vertex_build_result = await graph.build_vertex(
                                                             |     -> <function Graph.build_vertex at
                             0x000002684A39C680>
                                                             -> Graph Representation:
                                                                ----------------------
                                                                Vertices (11):
                                                                  ChatInput-3HVe5,
                             AstraVectorStoreComponent-ySu3U, ParseData-rYl...
                               File "C:\Python311\Lib\site-packages\langflow\graph\graph\base.py", line
                             1332, in build_vertex
                                 await vertex.build(
                                       |      -> <function Vertex.build at 0x000002684A38FE20>
                                       -> Vertex(display_name=Astra DB, id=AstraVectorStoreComponent-ySu3U,
                             data={'description': 'Implementation of Vector Store using ...
                               File "C:\Python311\Lib\site-packages\langflow\graph\vertex\base.py", line
                             797, in build
                                 await step(user_id=user_id, event_manager=event_manager, **kwargs)
                                       |            |                      |                ->
                             {'fallback_to_env_vars': False}
                                       |            |                      ->
                             <langflow.events.event_manager.EventManager object at 0x0000026860C44910>
                                       |            -> UUID('145754d6-7d55-480c-bac8-5a5e233c06ca')
                                       -> <bound method Vertex._build of Vertex(display_name=Astra DB,
                             id=AstraVectorStoreComponent-ySu3U, data={'description': 'Implem...
                               File "C:\Python311\Lib\site-packages\langflow\graph\vertex\base.py", line
                             475, in _build
                                 await self._build_results(
                                       |    -> <function Vertex._build_results at 0x000002684A38FA60>
                                       -> Vertex(display_name=Astra DB, id=AstraVectorStoreComponent-ySu3U,
                             data={'description': 'Implementation of Vector Store using ...
                             > File "C:\Python311\Lib\site-packages\langflow\graph\vertex\base.py", line
                             694, in _build_results
                                 result = await initialize.loading.get_instance_results(
                                                |          |       -> <function get_instance_results at
                             0x0000026848918F40>
                                                |          -> <module
                             'langflow.interface.initialize.loading' from
                             'C:\\Python311\\Lib\\site-packages\\langflow\\interface\\initialize\\loa...
                                                -> <module 'langflow.interface.initialize' from
                             'C:\\Python311\\Lib\\site-packages\\langflow\\interface\\initialize\\__init__.p
                             y'>
                               File
                             "C:\Python311\Lib\site-packages\langflow\interface\initialize\loading.py", line
                             64, in get_instance_results
                                 return await build_component(params=custom_params,
                             custom_component=custom_component)
                                              |                      |                               ->
                             <langflow.utils.validate.AstraVectorStoreComponent object at
                             0x0000026864CC6510>
                                              |                      -> {'embedding':
                             OllamaEmbeddings(base_url='http://localhost:11434',
                             model='jina/jina-embeddings-v2-base-en', embed_instruction=...
                                              -> <function build_component at 0x000002684A38DE40>
                               File
                             "C:\Python311\Lib\site-packages\langflow\interface\initialize\loading.py", line
                             151, in build_component
                                 build_results, artifacts = await custom_component.build_results()
                                                                  |                -> <function
                             Component.build_results at 0x000002684A38D760>
                                                                  ->
                             <langflow.utils.validate.AstraVectorStoreComponent object at
                             0x0000026864CC6510>
                               File
                             "C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.py",
                             line 617, in build_results
                                 return await self._build_with_tracing()
                                              |    -> <function Component._build_with_tracing at
                             0x000002684A38D620>
                                              -> <langflow.utils.validate.AstraVectorStoreComponent object
                             at 0x0000026864CC6510>
                               File "C:\Python311\Lib\contextlib.py", line 222, in __aexit__
                                 await self.gen.athrow(typ, value, traceback)
                                       |    |   |      |    |      -> <traceback object at
                             0x00000268011631C0>
                                       |    |   |      |    -> ValueError("Error performing search in
                             AstraDBVectorStore: 'content'")
                                       |    |   |      -> <class 'ValueError'>
                                       |    |   -> <method 'athrow' of 'async_generator' objects>
                                       |    -> <async_generator object TracingService.trace_context at
                             0x0000026864918A40>
                                       -> <contextlib._AsyncGeneratorContextManager object at
                             0x0000026801363F50>
                               File "C:\Python311\Lib\site-packages\langflow\services\tracing\service.py",
                             line 229, in trace_context
                                 raise e
                               File "C:\Python311\Lib\site-packages\langflow\services\tracing\service.py",
                             line 226, in trace_context
                                 yield self
                                       -> <langflow.services.tracing.service.TracingService object at
                             0x0000026864D6ED50>
                               File
                             "C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.py",
                             line 605, in _build_with_tracing
                                 _results, _artifacts = await self._build_results()
                                                              |    -> <function Component._build_results at
                             0x000002684A38D800>
                                                              ->
                             <langflow.utils.validate.AstraVectorStoreComponent object at
                             0x0000026864CC6510>
                               File
                             "C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.py",
                             line 640, in _build_results
                                 result = method()
                                          -> <bound method AstraVectorStoreComponent.search_documents of
                             <langflow.utils.validate.AstraVectorStoreComponent object at 0x00...
                               File "<string>", line 279, in search_documents

                             ValueError: Error performing search in AstraDBVectorStore: 'content'

                             ╭───────────────────── Traceback (most recent call last) ─────────────────────╮
                             │ in search_documents:277                                                     │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_core\vectorstores\base.py:337 in   │
                             │ search                                                                      │
                             │                                                                             │
                             │    334 │   │   │   │   "mmr", or "similarity_score_threshold".              │
                             │    335 │   │   """                                                          │
                             │    336 │   │   if search_type == "similarity":                              │
                             │ ❱  337 │   │   │   return self.similarity_search(query, **kwargs)           │
                             │    338 │   │   elif search_type == "similarity_score_threshold":            │
                             │    339 │   │   │   docs_and_similarities = self.similarity_search_with_rele │
                             │    340 │   │   │   │   query, **kwargs                                      │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\vectorstores.py:967 in     │
                             │ similarity_search                                                           │
                             │                                                                             │
                             │    964 │   │   """                                                          │
                             │    965 │   │   return [                                                     │
                             │    966 │   │   │   doc                                                      │
                             │ ❱  967 │   │   │   for (doc, _, _) in self.similarity_search_with_score_id( │
                             │    968 │   │   │   │   query=query,                                         │
                             │    969 │   │   │   │   k=k,                                                 │
                             │    970 │   │   │   │   filter=filter,                                       │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\vectorstores.py:1025 in    │
                             │ similarity_search_with_score_id                                             │
                             │                                                                             │
                             │   1022 │   │   │   )                                                        │
                             │   1023 │   │                                                                │
                             │   1024 │   │   embedding_vector = self._get_safe_embedding().embed_query(qu │
                             │ ❱ 1025 │   │   return self.similarity_search_with_score_id_by_vector(       │
                             │   1026 │   │   │   embedding=embedding_vector,                              │
                             │   1027 │   │   │   k=k,                                                     │
                             │   1028 │   │   │   filter=filter,                                           │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\vectorstores.py:1107 in    │
                             │ similarity_search_with_score_id_by_vector                                   │
                             │                                                                             │
                             │   1104 │   │   │   )                                                        │
                             │   1105 │   │   │   raise ValueError(msg)                                    │
                             │   1106 │   │   sort = {"$vector": embedding}                                │
                             │ ❱ 1107 │   │   return self._similarity_search_with_score_id_by_sort(        │
                             │   1108 │   │   │   sort=sort,                                               │
                             │   1109 │   │   │   k=k,                                                     │
                             │   1110 │   │   │   filter=filter,                                           │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\vectorstores.py:1129 in    │
                             │ _similarity_search_with_score_id_by_sort                                    │
                             │                                                                             │
                             │   1126 │   │   │   include_similarity=True,                                 │
                             │   1127 │   │   │   sort=sort,                                               │
                             │   1128 │   │   )                                                            │
                             │ ❱ 1129 │   │   return [                                                     │
                             │   1130 │   │   │   (                                                        │
                             │   1131 │   │   │   │   self.document_encoder.decode(hit),                   │
                             │   1132 │   │   │   │   hit["$similarity"],                                  │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\vectorstores.py:1131 in    │
                             │ <listcomp>                                                                  │
                             │                                                                             │
                             │   1128 │   │   )                                                            │
                             │   1129 │   │   return [                                                     │
                             │   1130 │   │   │   (                                                        │
                             │ ❱ 1131 │   │   │   │   self.document_encoder.decode(hit),                   │
                             │   1132 │   │   │   │   hit["$similarity"],                                  │
                             │   1133 │   │   │   │   hit["_id"],                                          │
                             │   1134 │   │   │   )                                                        │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langchain_astradb\utils\encoders.py:144 in   │
                             │ decode                                                                      │
                             │                                                                             │
                             │   141 │   @override                                                         │
                             │   142 │   def decode(self, astra_document: dict[str, Any]) -> Document:     │
                             │   143 │   │   return Document(                                              │
                             │ ❱ 144 │   │   │   page_content=astra_document["content"],                   │
                             │   145 │   │   │   metadata=astra_document["metadata"],                      │
                             │   146 │   │   )                                                             │
                             │   147                                                                       │
                             ╰─────────────────────────────────────────────────────────────────────────────╯
                             KeyError: 'content'

                             The above exception was the direct cause of the following exception:

                             ╭───────────────────── Traceback (most recent call last) ─────────────────────╮
                             │ C:\Python311\Lib\site-packages\langflow\graph\vertex\base.py:694 in         │
                             │ _build_results                                                              │
                             │                                                                             │
                             │   691 │                                                                     │
                             │   692 │   async def _build_results(self, custom_component, custom_params,   │
                             │       fallback_to_env_vars=False):                                          │
                             │   693 │   │   try:                                                          │
                             │ ❱ 694 │   │   │   result = await initialize.loading.get_instance_results(   │
                             │   695 │   │   │   │   custom_component=custom_component,                    │
                             │   696 │   │   │   │   custom_params=custom_params,                          │
                             │   697 │   │   │   │   vertex=self,                                          │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\interface\initialize\loading.py:64  │
                             │ in get_instance_results                                                     │
                             │                                                                             │
                             │    61 │   │   if base_type == "custom_components":                          │
                             │    62 │   │   │   return await build_custom_component(params=custom_params, │
                             │       custom_component=custom_component)                                    │
                             │    63 │   │   elif base_type == "component":                                │
                             │ ❱  64 │   │   │   return await build_component(params=custom_params,        │
                             │       custom_component=custom_component)                                    │
                             │    65 │   │   else:                                                         │
                             │    66 │   │   │   raise ValueError(f"Base type {base_type} not found.")     │
                             │    67                                                                       │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\interface\initialize\loading.py:151 │
                             │ in build_component                                                          │
                             │                                                                             │
                             │   148 ):                                                                    │
                             │   149 │   # Now set the params as attributes of the custom_component        │
                             │   150 │   custom_component.set_attributes(params)                           │
                             │ ❱ 151 │   build_results, artifacts = await custom_component.build_results() │
                             │   152 │                                                                     │
                             │   153 │   return custom_component, build_results, artifacts                 │
                             │   154                                                                       │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.p │
                             │ y:617 in build_results                                                      │
                             │                                                                             │
                             │   614 │   │   self,                                                         │
                             │   615 │   ):                                                                │
                             │   616 │   │   if self._tracing_service:                                     │
                             │ ❱ 617 │   │   │   return await self._build_with_tracing()                   │
                             │   618 │   │   return await self._build_without_tracing()                    │
                             │   619 │                                                                     │
                             │   620 │   async def _build_results(self):                                   │
                             │                                                                             │
                             │ C:\Python311\Lib\contextlib.py:222 in __aexit__                             │
                             │                                                                             │
                             │   219 │   │   │   │   # tell if we get the same exception back              │
                             │   220 │   │   │   │   value = typ()                                         │
                             │   221 │   │   │   try:                                                      │
                             │ ❱ 222 │   │   │   │   await self.gen.athrow(typ, value, traceback)          │
                             │   223 │   │   │   except StopAsyncIteration as exc:                         │
                             │   224 │   │   │   │   # Suppress StopIteration *unless* it's the same excep │
                             │   225 │   │   │   │   # was passed to throw().  This prevents a StopIterati │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\services\tracing\service.py:229 in  │
                             │ trace_context                                                               │
                             │                                                                             │
                             │   226 │   │   │   yield self                                                │
                             │   227 │   │   except Exception as e:                                        │
                             │   228 │   │   │   self._end_traces(trace_id, trace_name, e)                 │
                             │ ❱ 229 │   │   │   raise e                                                   │
                             │   230 │   │   finally:                                                      │
                             │   231 │   │   │   asyncio.create_task(await asyncio.to_thread(self._end_and │
                             │       trace_name, None))                                                    │
                             │   232                                                                       │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\services\tracing\service.py:226 in  │
                             │ trace_context                                                               │
                             │                                                                             │
                             │   223 │   │   │   component._vertex,                                        │
                             │   224 │   │   )                                                             │
                             │   225 │   │   try:                                                          │
                             │ ❱ 226 │   │   │   yield self                                                │
                             │   227 │   │   except Exception as e:                                        │
                             │   228 │   │   │   self._end_traces(trace_id, trace_name, e)                 │
                             │   229 │   │   │   raise e                                                   │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.p │
                             │ y:605 in _build_with_tracing                                                │
                             │                                                                             │
                             │   602 │   │   inputs = self.get_trace_as_inputs()                           │
                             │   603 │   │   metadata = self.get_trace_as_metadata()                       │
                             │   604 │   │   async with self._tracing_service.trace_context(self, self.tra │
                             │       metadata):                                                            │
                             │ ❱ 605 │   │   │   _results, _artifacts = await self._build_results()        │
                             │   606 │   │   │   self._tracing_service.set_outputs(self.trace_name, _resul │
                             │   607 │   │                                                                 │
                             │   608 │   │   return _results, _artifacts                                   │
                             │                                                                             │
                             │ C:\Python311\Lib\site-packages\langflow\custom\custom_component\component.p │
                             │ y:640 in _build_results                                                     │
                             │                                                                             │
                             │   637 │   │   │   │   │   │   _results[output.name] = output.value          │
                             │   638 │   │   │   │   │   │   result = output.value                         │
                             │   639 │   │   │   │   │   else:                                             │
                             │ ❱ 640 │   │   │   │   │   │   result = method()                             │
                             │   641 │   │   │   │   │   │   # If the method is asynchronous, we need to a │
                             │   642 │   │   │   │   │   │   if inspect.iscoroutinefunction(method):       │
                             │   643 │   │   │   │   │   │   │   result = await result                     │
                             │ in search_documents:279                                                     │
                             ╰─────────────────────────────────────────────────────────────────────────────╯
                             ValueError: Error performing search in AstraDBVectorStore: 'content'

Reproduction

  1. Set up a database collection of type vector with the Jina integration (model: jina-embeddings-v2-base-en).
  2. Load data into the collection.
  3. Create a Langflow project from the RAG AstraDB template, using Ollama Embeddings (same model as in step 1).
  4. The AstraDB step crashes.

Expected behavior

Data should be returned from AstraDB.

Who can help?

No response

Operating System

Windows

Langflow Version

Latest

Python Version

3.11

Screenshot

No response

Flow File

No response

Hello, @sabatale! I'm here to help you with any bugs, questions, or contributions you have. Let's get to work!

To resolve the "ValueError: Error performing search in AstraDBVectorStore: 'content'" error, you need to ensure that the documents retrieved from the AstraDB collection contain the "content" field. The error suggests that the code is trying to access the "content" field in the documents, but it is not present.
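
As a minimal sketch of why the message ends in just 'content': when a KeyError is caught and re-raised inside a formatted ValueError, only the quoted key name ends up in the text. The function below is a hypothetical stand-in written for illustration, not Langflow's actual search_documents implementation; it assumes the search step reads doc["content"] and wraps any exception.

```python
def search_documents(documents):
    """Hypothetical stand-in for a search step that reads doc["content"]."""
    try:
        return [doc["content"] for doc in documents]
    except Exception as e:
        # str(KeyError("content")) is "'content'", so the wrapped message
        # shows only the quoted key name, with no hint that a key was missing.
        raise ValueError(f"Error performing search in AstraDBVectorStore: {e}") from e


if __name__ == "__main__":
    try:
        # A document that stores its text under a different key.
        search_documents([{"_id": "1", "text": "hello"}])
    except ValueError as err:
        print(err)
```

Chaining with `from e` keeps the original KeyError available as `__cause__`, which makes the missing key easier to spot when debugging.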

Here's the relevant part of the code that processes the documents:

def _astradb_collection_to_data(self, collection):
    data = []
    data_dict = collection.find()
    if data_dict and "data" in data_dict:
        data_dict = data_dict["data"].get("documents", [])

    for item in data_dict:
        data.append(Data(content=item["content"]))
    return data

To fix this, you should verify that the documents in your AstraDB collection have the "content" field. If they do not, you need to modify the documents to include this field or adjust the code to handle documents without the "content" field.

Here is a modified version of the _astradb_collection_to_data method that includes a check for the "content" field:

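import logging

logger = logging.getLogger(__name__)  # assumption: the snippet does not show where logger is defined


def _astradb_collection_to_data(self, collection):
    data = []
    data_dict = collection.find()
    if data_dict and "data" in data_dict:
        data_dict = data_dict["data"].get("documents", [])

    for item in data_dict:
        if "content" in item:
            data.append(Data(content=item["content"]))
        else:
            # Skip documents without a "content" field instead of raising KeyError.
            logger.warning(f"Document missing 'content' field: {item}")
    return data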

This modification will log a warning if a document is missing the "content" field and skip adding it to the data list. This should prevent the "ValueError" from being raised due to missing "content" fields [1].

Hi @sabatale,

Thank you for your feedback! It would be great if you could share the Flow.json file for debugging.

Regarding the issue, I tested with Ollama using Jina embeddings and the AstraDB Vector Store, and it works as expected. Could you please provide more details about your data ingestion pipeline? Was the data ingested through Langflow?

For reference, I have attached a screenshot of the working flow.

Vector Store RAG Ollama.json
Screenshot 2024-09-30 at 11 32 07 PM

Also, if the ingestion is not done through Langflow, please ensure that the Vector DB has the required fields. A sample data point from AstraDB ingested using Langflow will have the following structure:

Top-Level Keys:

  • _id: A string representing the unique identifier.
  • content: A string containing textual content.
  • $vector: An array of numerical values.
  • metadata: An object containing additional information.

Keys within metadata:

  • source: A string indicating the source URL.
  • title: A string representing the title.
  • language: A string specifying the language code.

Overall JSON Structure:

{
  "_id": "string",
  "content": "string",
  "$vector": [number, number, ...],
  "metadata": {
    "source": "string",
    "title": "string",
    "language": "string"
  }
}

Please ensure your data matches this structure so that it can be processed correctly. Let me know if you have any questions or need further assistance.
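
One way to check a collection before pointing Langflow at it is to test each document for the top-level keys listed above. This is a minimal sketch on plain dictionaries: missing_keys and REQUIRED_KEYS are hypothetical names, and treating all four keys as required is an assumption (the error in this thread only concerns content).

```python
# Top-level keys from the structure described above.
# Assumption: all four are treated as required for this check.
REQUIRED_KEYS = ("_id", "content", "$vector", "metadata")


def missing_keys(document: dict) -> list[str]:
    """Return the expected top-level keys that the document lacks."""
    return [key for key in REQUIRED_KEYS if key not in document]


if __name__ == "__main__":
    ingested_with_langflow = {
        "_id": "1",
        "content": "some text",
        "$vector": [0.1, 0.2],
        "metadata": {"source": "https://example.com", "title": "Example", "language": "en"},
    }
    # A document that keeps its text under a different key.
    ingested_elsewhere = {"_id": "2", "text": "some text", "$vector": [0.1, 0.2]}

    print(missing_keys(ingested_with_langflow))
    print(missing_keys(ingested_elsewhere))
```

An empty list means the document has every expected key; anything else names the fields to add or rename during ingestion.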

Could this information about the required DataStax schema for Langflow please be added to the documentation?

This situation can be managed if the ingest pipeline utilizes the Langflow AstraDB component. I will create a separate issue to add this to the documentation. @cystema Thank you for the feedback.

@edwinjosechittilappilly Thanks for the template! It differs from the one we get when loading data outside of Langflow, which causes the error:

"_id": ""
"type": "CompositeElement"
"text": ""
"$vector": ""