infiniflow/ragflow

[Bug]: Knowledge Graph [ERROR]Generate embedding error:'title_tks'

Closed this issue · 3 comments

Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

0d5486a(v0.14.1~75) full

RAGFlow image version

0d5486a(v0.14.1~75) full - self build

Other environment information

Install environment: Ubuntu 24
RAGFlow version: 0d5486aa(v0.14.1~75) full

Actual behavior

Knowledge Graph parsing fails at the embedding step. I tried the following embedding models: Ollama nomic-embed-text, Baichuan-Text-Embedding, and Zhipu embedding-3, with DeepSeek as the LLM. All of them report the error shown in the picture.

image

Expected behavior

reported errors as shown in the picture

image

log info

2024-12-06 16:34:12,414 INFO     15 set_progress(39a518c0b3ac11ef886c0242ac120006), progress: -1, progress_msg: Page(1~100000001): [ERROR]Generate embedding error:'title_tks'
2024-12-06 16:34:12,441 ERROR    15 Generate embedding error:'title_tks'
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 416, in do_handle_task
    token_count, vector_size = embedding(chunks, embedding_model, task_parser_config, progress_callback)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/svr/task_executor.py", line 276, in embedding
    tts.append(rmSpace(d["title_tks"]))
                       ~^^^^^^^^^^^^^
KeyError: 'title_tks'
2024-12-06 16:34:12,443 ERROR    15 handle_task got exception for task {"id": "39a518c0b3ac11ef886c0242ac120006", "doc_id": "5d3f2834b3a811ef8f060242ac120006", "from_page": 0, "to_page": 100000000, "retry_count": 0, "kb_id": "f03207d8b38e11efa8d90242ac120006", "parser_id": "knowledge_graph", "parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": false}, "chunk_token_num": 8192, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "layout_recognize": true, "html4excel": false, "entity_types": ["organization", "person", "location", "event", "time"], "pages": []}, "name": "hd.txt", "type": "doc", "location": "hd.txt", "size": 1826, "tenant_id": "169257b6b38d11ef8a960242ac120006", "language": "Chinese", "embd_id": "nomic-embed-text:latest@Ollama", "pagerank": 0, "img2txt_id": "qwen-vl-max@Tongyi-Qianwen", "asr_id": "paraformer-realtime-8k-v1@Tongyi-Qianwen", "llm_id": "deepseek-chat@DeepSeek", "update_time": 1733473775470}
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 463, in handle_task
    do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 416, in do_handle_task
    token_count, vector_size = embedding(chunks, embedding_model, task_parser_config, progress_callback)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragflow/rag/svr/task_executor.py", line 276, in embedding
    tts.append(rmSpace(d["title_tks"]))
                       ~^^^^^^^^^^^^^
KeyError: 'title_tks'
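
The traceback points at a plain dictionary lookup, `d["title_tks"]`, on a chunk that has no such key. The sketch below reproduces that failure mode in isolation and shows one defensive alternative. It is only an illustration under stated assumptions: `rm_space` is a hypothetical stand-in for RAGFlow's `rmSpace`, the chunk dictionaries and the `content` field are made up, and the tolerant variant is not necessarily how the project fixed the problem.

```python
def rm_space(text: str) -> str:
    """Hypothetical stand-in for rmSpace: collapses runs of whitespace."""
    return " ".join(text.split())


def collect_titles_strict(chunks):
    # Mirrors the failing line in the traceback: direct indexing raises
    # KeyError as soon as one chunk has no "title_tks" key.
    return [rm_space(d["title_tks"]) for d in chunks]


def collect_titles_tolerant(chunks):
    # Defensive variant (an assumption, not the upstream fix): treat a
    # missing title as an empty string instead of failing the whole task.
    return [rm_space(d.get("title_tks", "")) for d in chunks]


# Made-up chunks: the second one lacks "title_tks", as the chunks produced
# by the Knowledge Graph parser apparently did in this report.
chunks = [
    {"title_tks": "hd  txt", "content": "some text"},
    {"content": "entity description"},
]

try:
    collect_titles_strict(chunks)
except KeyError as exc:
    print("strict lookup failed with KeyError:", exc)

print(collect_titles_tolerant(chunks))
```

Whether an empty title is acceptable downstream depends on how the embedding step combines title and content vectors, so this variant is a way to localize the problem rather than a recommended patch.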

Steps to reproduce

- Select the Knowledge Graph method.
- Configure an embedding model: Ollama nomic-embed-text, Baichuan-Text-Embedding, or Zhipu embedding-3.
- Parse a document.
- The error is triggered.

Additional information

No response

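I ran into this problem too. At first I thought the file was too large, but then I uploaded a small docx of about 200 characters and it reported the same error.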

#3875 I had the same problem. It looks like an error introduced by an added feature: after I rolled back the task_executor file, the error no longer occurred.

Fixed by #3931