Cinnamon/kotaemon

[BUG] LightRag - LLM call error

Closed this issue · 2 comments

Description

Hi,
I am getting an error when adding documents to the LightRag index. The file is large and gets broken down into 250 chunks of 1200 tokens each. However, the LLM call returned an error during entity/relationship extraction at chunk 197. Because of that, the indexing failed and I had to reindex; it then failed again at chunk 155. See the error below.

Indexing [1/1]: document sample.txt
=> Converting document sample.txt to text
=> Converted document sample.txt to text
=> [document sample.txt] Processed 1 chunks
=> Finished indexing document sample.txt
[GraphRAG] Creating index... This can take a long time.
[GraphRAG] Indexed 0 / 1 documents.
Error: Error code: 500 - {'type': 'error', 'error': {'type': 'api_error', 'message': 'Internal server error'}}

Log info

Extracting entities from chunks: 80%|█████████████████████████████████▋ | 197/246 [1:17:33<19:17, 23.62s/chunk]
INFO:lightrag:Writing graph with 0 nodes, 0 edges

I have seen the exact same issue when using LightRag directly, and the solution was to add retry in front of the llm_model_func.

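from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# The exception classes below come from the LLM client SDK in use.
@retry(
    stop=stop_after_attempt(3),  # give up after 3 attempts
    wait=wait_exponential(multiplier=1, min=4, max=10),  # back off between 4 s and 10 s
    retry=retry_if_exception_type((RateLimitError, APIConnectionError, Timeout, APIError)),
)
async def llm_model_func(prompt, **kwargs): ...  # existing function body goes here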

Can this retry be added to Kotaemon? I am wasting a lot of Anthropic tokens re-indexing the same documents :)
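To make the request concrete, here is a minimal, dependency-free sketch of the same retry idea. It is not the tenacity decorator above and not Kotaemon's or LightRag's actual code: `retry_async`, `TransientLLMError`, `fake_sleep`, and `flaky_llm_model_func` are hypothetical names made up for illustration, and the backoff formula only resembles `wait_exponential` in spirit. The sleep function is injectable so the demo runs instantly.

```python
import asyncio
import functools


class TransientLLMError(Exception):
    """Hypothetical stand-in for SDK errors such as RateLimitError or APIError."""


def retry_async(attempts=3, multiplier=1.0, min_wait=4.0, max_wait=10.0,
                retry_on=(TransientLLMError,), sleep=asyncio.sleep):
    """Retry an async function with clamped exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except retry_on:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    # Exponential delay, clamped to [min_wait, max_wait] seconds.
                    delay = min(max(multiplier * 2 ** (attempt - 1), min_wait), max_wait)
                    await sleep(delay)
        return wrapper
    return decorator


# Demo: a fake LLM call that fails twice with a transient error, then succeeds.
calls = {"n": 0}
waits = []


async def fake_sleep(delay):
    # Record the requested delay instead of actually sleeping.
    waits.append(delay)


@retry_async(sleep=fake_sleep)
async def flaky_llm_model_func(prompt):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientLLMError("Internal server error")
    return f"entities for: {prompt}"


result = asyncio.run(flaky_llm_model_func("chunk 197"))
```

With a wrapper like this around the per-chunk LLM call, a single 500 response would cost one repeated request rather than a full re-index of every chunk.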

Reproduction steps

1. Go to 'Files'
2. Click on 'LightRag'
3. Upload a large document.

Hi, thanks for the report. This seems to be a valid problem. Can you help create a PR to include this fix?

All right, I created PR #572.