truera/trulens

[BUG] Get concurrent.futures._base.TimeoutError

rushmj opened this issue · 5 comments

When I run:

    with tru_rag as recording:
        for question in eval_questions:
            response = vector_rag.query(question)

I get "concurrent.futures._base.TimeoutError"

Hey @rushmj, good to see you around here again! Hope you're doing well.

The concurrent.futures._base.TimeoutError you're encountering is likely due to asynchronous operations not completing within their allotted time frame. This error is common in asynchronous programming when a future or task does not finish executing before its timeout expires.

Given the context of your code, there are a few areas where this issue might arise, particularly in the asynchronous methods main_acall, achat, aquery, astream_chat, and aquery_with_record. These methods involve asynchronous calls to the underlying llama_index library's methods like aquery and achat, which are used to perform queries or chat operations asynchronously.

When you run queries in a loop, especially if you're awaiting responses from an external service or database that llama_index interacts with, it's possible that the responses take longer than expected. If the execution time exceeds the default or specified timeout for these asynchronous operations, a TimeoutError will be raised.
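To make the mechanism concrete, here is a minimal sketch using only the standard library. It is not trulens code: `slow_task` and `run_with_timeout` are hypothetical names, and the sleep stands in for a slow query or feedback call. It shows how waiting on a future with a deadline raises `concurrent.futures.TimeoutError` when the work is not finished in time.

```python
import concurrent.futures
import time


def slow_task(seconds):
    """Stand-in for a slow query or feedback call (hypothetical)."""
    time.sleep(seconds)
    return "done"


def run_with_timeout(func, timeout, *args):
    """Run func in a worker thread; return None if it misses the deadline."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(func, *args)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            # result() stopped waiting, but the worker thread keeps running;
            # the `with` block still waits for it to finish on exit.
            return None
```

Note that on Python 3.10 `concurrent.futures.TimeoutError` is a separate class from the built-in `TimeoutError`; from Python 3.11 onward it is an alias of the built-in.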

To address this issue, consider the following approaches:

  1. Increase Timeout Duration: If the timeout is configurable in the llama_index library or in your asynchronous task setup, increasing it gives each query more time to complete before the timeout is triggered. In the trulens library, the timeout for these operations is configurable by the user: the post method of the DummyEndpoint class takes an optional timeout parameter, which defaults to DEFAULT_NETWORK_TIMEOUT when it is not provided.

  2. Optimize Query Performance: If the queries themselves are slow, optimizing the queries or the data they operate on could reduce response times and help avoid timeouts.

  3. Concurrency Control: If you're sending too many concurrent requests in your loop, this could overwhelm the server or the service handling the queries, leading to slower response times. Implementing concurrency control by limiting the number of concurrent asynchronous tasks might help manage load and prevent timeouts.

  4. Error Handling: Implement error handling for TimeoutError to manage and possibly retry the operation in case of a timeout. This can be part of a broader strategy to make your application more resilient to such errors.
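Approaches 3 and 4 can be sketched together with the standard library. This is only an illustration under assumptions: `call_with_retry` and `run_bounded` are hypothetical helpers, not part of trulens or llama_index, and the sketch assumes the timeout is raised by the call you wrap. A timeout raised inside a background callback would not be caught this way.

```python
import concurrent.futures
import time


def call_with_retry(fn, *args, retries=3, delay=0.0):
    """Call fn(*args), retrying up to `retries` times on a timeout."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return fn(*args)
        except (concurrent.futures.TimeoutError, TimeoutError):
            if attempt == retries:
                raise  # out of attempts: surface the last timeout
            time.sleep(delay * attempt)  # simple linear backoff


def run_bounded(fn, items, max_workers=2, retries=3):
    """Run fn over items with at most max_workers calls in flight."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(call_with_retry, fn, item, retries=retries)
            for item in items
        ]
        # Results come back in the same order as the input items.
        return [future.result() for future in futures]
```

In the loop from the original question, something like `run_bounded(vector_rag.query, eval_questions)` would be the analogous call, assuming `vector_rag.query` is safe to call from several threads, which this thread does not confirm.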

Without more specific details about your loop and how the asynchronous operations are managed (e.g., timeout settings, concurrency management), these suggestions are based on common practices for dealing with TimeoutError in asynchronous Python code. If you could provide more details about your setup, I might be able to give more specific advice.

For more information, you can refer to the trulens_eval/trulens_eval/tru_llama.py and trulens_eval/trulens_eval/feedback/provider/endpoint/base.py files in the trulens repository.


@rushmj can you share more info including the stack trace and reproducible code example?

Run of run in <Thread(TP.submit with debug timeout_2, started 138842688448064)> timed out after 600.0 second(s).
None
exception calling callback for <Future at 0x7e43a8a7c0a0 state=finished raised TimeoutError>
Traceback (most recent call last):
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
    callback(self)
  File "/home/guest/.local/lib/python3.10/site-packages/trulens_eval/app.py", line 1168, in _add_future_feedback
    res = future_result.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/trulens_eval/utils/python.py", line 374, in _future_target_wrapper
    return func(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/trulens_eval/utils/threading.py", line 168, in _run_with_timeout
    raise e
  File "/home/guest/.local/lib/python3.10/site-packages/trulens_eval/utils/threading.py", line 159, in _run_with_timeout
    res: T = fut.result(timeout=timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 460, in result
    raise TimeoutError()
concurrent.futures._base.TimeoutError

Hi @rushmj, can you share how you structured the app you are recording here?

@rushmj - please let us know if you're still facing this issue. If so, please share your app setup. Closing for now.