Support streaming as part of thread runs and LLM generations
multipletwigs opened this issue · 0 comments
multipletwigs commented
Problem Statement
- Based on a previous PR #9, we have introduced the concept of thread runs, where we await a response based on the content of the thread.
- We should have the option to stream the answer back to the consumer of the API to accommodate the slow response time of LLMs.
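The idea could be sketched roughly as below: instead of awaiting the full completion, the thread run yields chunks to the consumer as they arrive. All names here (`generate_tokens`, `run_thread_streaming`) are hypothetical placeholders, not the actual API of this repo, and the token source is a stub standing in for a real streaming LLM call.

```python
from typing import Iterator


def generate_tokens(prompt: str) -> Iterator[str]:
    # Stub for a streaming LLM call: yields tokens as they are produced
    # rather than returning the whole completion at once.
    for token in ["Hello", ", ", "world", "!"]:
        yield token


def run_thread_streaming(prompt: str) -> Iterator[str]:
    # The thread run forwards each chunk to the consumer immediately,
    # so a slow LLM response is perceived incrementally instead of as
    # one long wait for the final answer.
    for chunk in generate_tokens(prompt):
        yield chunk


# A consumer can iterate chunk by chunk, or join everything at the end.
print("".join(run_thread_streaming("hi")))
```

In an HTTP API this generator would typically be wired to server-sent events or a chunked response body, but that transport choice is outside the scope of this sketch.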