[BUG] NanoGraphRAG TypeError: 'NoneType' object is not subscriptable
vap0rtranz opened this issue · 4 comments
Description
I'm running NanoGraphRAG via Docker against a local Ollama instance. I pulled the main-full
Docker image this morning; it was tagged as 10 days old.
I ran this image with USE_NANO_GRAPHRAG=true
but saw an error, so I installed Nano following the README.
Simple reasoning Chats with File Collection backed by Ollama work. The Information Panel is populated with indexed docs.
Chat with NanoGraphRAG Collection errors out in the UI. I've pasted the log below. The error ends with:
TypeError: 'NoneType' object is not subscriptable
What else can I check to find the root cause?
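One check that may help, given that the log below reports `full_docs` and `text_chunks` loading with 0 entries: count what is actually stored in the NanoGraphRAG working directory. This is a minimal sketch, not part of Kotaemon or nano-graphrag. It assumes the key-value stores are plain JSON files sitting next to the `graph_chunk_entity_relation.graphml` file named in the log; the `kv_store_*.json` naming pattern and the helper name `count_kv_entries` are assumptions.

```python
import json
from pathlib import Path


def count_kv_entries(working_dir, pattern="kv_store_*.json"):
    """Return {file name: number of top-level entries} for each JSON store.

    The file-name pattern is an assumption about how the stores are laid
    out on disk; adjust it to match what the directory actually contains.
    """
    counts = {}
    for path in sorted(Path(working_dir).glob(pattern)):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Only dict- or list-shaped stores have a meaningful entry count.
        counts[path.name] = len(data) if isinstance(data, (dict, list)) else None
    return counts
```

Running `count_kv_entries(...)` on the `input` directory shown in the log and getting 0 for the text chunk store would be consistent with the traceback, since the query step would then have no chunk content to look up.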
Reproduction steps
1. Go to Chat->NanoGraphRAG
2. Click on a Search in Files->select a file
3. Switch to Chat window, and prompt for "Please summarize"
4. See error
Logs
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/embeddings "HTTP/1.1 200 OK"
GraphRAG embedding dim 768
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 8 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-graphrag:Loaded graph from /app/ktem_app_data/user_data/files/nano_graphrag/580ad20d-9321-4e5c-9c93-707181e1976c/input/graph_chunk_entity_relation.graphml with 1 nodes, 0 edges
INFO:nano-vectordb:Load (1, 768) data
INFO:nano-vectordb:Init {'embedding_dim': 768, 'metric': 'cosine', 'storage_file': '/app/ktem_app_data/user_data/files/nano_graphrag/580ad20d-9321-4e5c-9c93-707181e1976c/input/vdb_entities.json'} 1 data
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 575, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1923, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 663, in async_iteration
return await iterator.__anext__()
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 656, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 801, in gen_wrapper
response = next(iterator)
File "/app/libs/ktem/ktem/pages/chat/__init__.py", line 812, in chat_fn
for response in pipeline.stream(chat_input, conversation_id, chat_history):
File "/app/libs/ktem/ktem/reasoning/simple.py", line 741, in stream
docs, infos = self.retrieve(message, history)
File "/app/libs/ktem/ktem/reasoning/simple.py", line 517, in retrieve
retriever_docs = retriever_node(text=query)
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1097, in __call__
raise e from None
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1088, in __call__
output = self.fl.exec(func, args, kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/backends/base.py", line 151, in exec
return run(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 144, in __call__
raise e from None
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 141, in __call__
_output = self.next_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/middleware.py", line 117, in __call__
return self.next_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/theflow/base.py", line 1017, in _runx
return self.run(*args, **kwargs)
File "/app/libs/ktem/ktem/index/file/graph/nano_pipelines.py", line 385, in run
entities, relationships, reports, sources = asyncio.run(
File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/app/libs/ktem/ktem/index/file/graph/nano_pipelines.py", line 158, in nano_graph_rag_build_local_query_context
use_text_units = await _find_most_related_text_unit_from_entities(
File "/usr/local/lib/python3.10/site-packages/nano_graphrag/_op.py", line 772, in _find_most_related_text_unit_from_entities
all_text_units = truncate_list_by_token_size(
File "/usr/local/lib/python3.10/site-packages/nano_graphrag/_utils.py", line 74, in truncate_list_by_token_size
tokens += len(encode_string_by_tiktoken(key(data)))
File "/usr/local/lib/python3.10/site-packages/nano_graphrag/_op.py", line 774, in <lambda>
key=lambda x: x["data"]["content"],
TypeError: 'NoneType' object is not subscriptable
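The last frames show the failure mode: the `key` lambda indexes `x["data"]`, so any entry whose `data` is `None` raises this `TypeError`. That would fit the earlier log line reporting `text_chunks` with 0 data while the graph still holds one entity. The following is a simplified sketch of the situation and of a defensive filter, not nano-graphrag's actual code: `truncate_by_size` is a hypothetical stand-in that counts whitespace-separated words instead of tiktoken tokens, and `drop_missing_chunks` is a hypothetical helper.

```python
def truncate_by_size(items, key, max_size):
    """Keep leading items until the summed size of key(item) exceeds max_size."""
    total = 0
    for i, item in enumerate(items):
        total += len(key(item).split())
        if total > max_size:
            return items[:i]
    return items


def drop_missing_chunks(units):
    """Discard entries whose chunk lookup returned nothing (data is None)."""
    return [u for u in units if u.get("data") is not None]


units = [
    {"id": "chunk-1", "data": {"content": "some indexed text"}},
    {"id": "chunk-2", "data": None},  # chunk id referenced, but nothing stored
]

try:
    truncate_by_size(units, key=lambda x: x["data"]["content"], max_size=100)
except TypeError as exc:
    error_message = str(exc)  # 'NoneType' object is not subscriptable

safe = truncate_by_size(
    drop_missing_chunks(units), key=lambda x: x["data"]["content"], max_size=100
)
```

Filtering like this would only hide the symptom. If the chunk store is empty, the more useful question is why indexing did not populate it.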
Browsers
Firefox
OS
Linux
Additional information
I've tried with 2 different PDF files.
Hmm, I see another error in the UI. In the Files->Upload Info window, after I upload new files for Nano, this error appears at the bottom of the window:
[GraphRAG] Creating index... This can take a long time.
[GraphRAG] Indexed 0 / 648 documents.
Error: EmptyNetworkError
The Docker terminal output of the Kotaemon app does not show this error. Here is what I see at the end of its output after uploading files for Nano:
Would you like me to extract more entities?
--------------------------------------------------
⠼ Processed 4 chunks, 12 entities(duplicated), 0 relations(duplicated)
INFO:nano-graphrag:Inserting 10 vectors to entities
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/embeddings "HTTP/1.1 200 OK"
INFO:nano-graphrag:[Community Report]...
INFO:nano-graphrag:Writing graph with 10 nodes, 0 edges
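The terminal output reports 0 relations and a graph written with 10 nodes and 0 edges, which may be what the EmptyNetworkError in the UI reflects. A minimal sketch for confirming the node and edge counts directly from the `.graphml` file, using only the standard library; the helper name `count_graph_elements` is hypothetical.

```python
import xml.etree.ElementTree as ET


def count_graph_elements(graphml_path):
    """Return (node_count, edge_count) for a GraphML file."""
    nodes = edges = 0
    for element in ET.parse(graphml_path).iter():
        # Tags look like "{namespace}node"; compare only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "node":
            nodes += 1
        elif local_name == "edge":
            edges += 1
    return nodes, edges
```

An edge count of 0 after indexing would suggest that the model extracted entities but no relations between them, so there is nothing for a community step to cluster.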
Hi, we have that kind of error too. There is a FAQ about it https://github.com/gusye1234/nano-graphrag/blob/main/docs/FAQ.md
Interesting. I was using the default llama3.1:8b. I'll retest with qwen2.5:14b.
OK, I re-tested with Qwen and the EmptyNetworkError went away.
A different error appears in the UI after indexing, but the file does look to be indexed. Below is the snippet.
What is your setup?
I'm curious whether you have set up the NanoGraphCollection differently than I have.
Indexing [1/1]: IPCC_AR6_SYR_LongerReport.pdf
=> Converting IPCC_AR6_SYR_LongerReport.pdf to text
=> Converted IPCC_AR6_SYR_LongerReport.pdf to text
=> [IPCC_AR6_SYR_LongerReport.pdf] Processed 136 chunks
=> Finished indexing IPCC_AR6_SYR_LongerReport.pdf
Error: name 'EmbeddingFunc' is not defined
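A `NameError` like this is what Python raises when a name was never bound, which can happen if an optional import failed quietly earlier on. Whether that is the cause here is not confirmed. A small sketch for testing it inside the container, assuming `EmbeddingFunc` is expected to come from `nano_graphrag._utils` (an assumption, not verified against the Kotaemon source); `has_attr` is a hypothetical helper.

```python
import importlib


def has_attr(module_name, attr_name):
    """Return True if module_name imports cleanly and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too: the package is missing or broken.
        return False
    return hasattr(module, attr_name)
```

Calling `has_attr("nano_graphrag._utils", "EmbeddingFunc")` from the same Python environment that runs the app and getting `False` would point to the package being missing or a different version than expected.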