Support streaming as part of thread runs and LLM generations
multipletwigs opened this issue · 0 comments
multipletwigs commented
Problem Statement
- Based on a previous PR #9, we have introduced the concept of thread runs, where we await a response based on the content of the thread.
- We should have the option to stream the answer back to the consumer of the API to accommodate the slow response time of LLMs.
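The idea could be sketched roughly as below: instead of awaiting the full completion, the thread run yields chunks to the consumer as they arrive. All names here (`generate_tokens`, `run_thread_streaming`) are hypothetical placeholders, not the actual API of this repo, and the token source is a stub standing in for a real streaming LLM call.

```python
from typing import Iterator


def generate_tokens(prompt: str) -> Iterator[str]:
    # Stub for a streaming LLM call: yields tokens as they are produced
    # rather than returning the whole completion at once.
    for token in ["Hello", ", ", "world", "!"]:
        yield token


def run_thread_streaming(prompt: str) -> Iterator[str]:
    # The thread run forwards each chunk to the consumer immediately,
    # so a slow LLM response is perceived incrementally instead of as
    # one long wait for the final answer.
    for chunk in generate_tokens(prompt):
        yield chunk


# A consumer can iterate chunk by chunk, or join everything at the end.
print("".join(run_thread_streaming("hi")))
```

In an HTTP API this generator would typically be wired to server-sent events or a chunked response body, but that transport choice is outside the scope of this sketch.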