This repo shows how to stream the responses of a fine-tuned LLM using FastAPI. For more information, please go through this Medium article.
See src/fast_trial.py to understand how streaming works in FastAPI.
See src/fast_llm.py to understand how streaming works with an LLM and FastAPI.
See src/fast_langchain.py for streaming with LangChain. It uses an OpenAI model, with a custom handler defined in src/handlers.py.
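One common way to build such a handler is to push each new token onto a queue while the model runs in a background thread, and to drain that queue from a generator. The sketch below shows the pattern with the standard library only. `QueueTokenHandler`, `fake_llm`, and `stream_tokens` are hypothetical names, and the real code in src/handlers.py may be organised differently. The method names mirror the `on_llm_new_token` and `on_llm_end` hooks of LangChain's `BaseCallbackHandler`, which a real handler would presumably subclass.

```python
import queue
import threading
from typing import Iterator

_DONE = object()  # sentinel marking the end of generation


class QueueTokenHandler:
    """Hypothetical stand-in for a LangChain callback handler."""

    def __init__(self) -> None:
        self.tokens: queue.Queue = queue.Queue()

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Called once per generated token; hand it to the consumer.
        self.tokens.put(token)

    def on_llm_end(self, *args, **kwargs) -> None:
        # Signal that no more tokens will arrive.
        self.tokens.put(_DONE)


def fake_llm(prompt: str, handler: QueueTokenHandler) -> None:
    """Stand-in for a model call: echoes the prompt back word by word."""
    for word in prompt.split():
        handler.on_llm_new_token(word + " ")
    handler.on_llm_end()


def stream_tokens(prompt: str) -> Iterator[str]:
    """Run the model in a background thread and yield tokens as they arrive."""
    handler = QueueTokenHandler()
    worker = threading.Thread(target=fake_llm, args=(prompt, handler), daemon=True)
    worker.start()
    while True:
        token = handler.tokens.get()  # blocks until the next token is ready
        if token is _DONE:
            break
        yield token
    worker.join()
```

A generator like `stream_tokens` can be passed directly to FastAPI's `StreamingResponse`, which accepts ordinary iterators as well as async ones.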
Run the following commands from the command line:
cd src
# For running trial
uvicorn fast_trial:app
# For running local LLM streaming
uvicorn fast_llm:app
# For running LangChain streaming
uvicorn fast_langchain:app
For testing, see src/stream_test.py.
You can also look at the notebook notebooks/stream_test.py.
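For a sense of what a streaming client involves, here is a minimal standard-library sketch. It is not the code in src/stream_test.py. `decode_chunks` and `read_stream` are hypothetical helpers, and the response is assumed to be UTF-8 text.

```python
import codecs
from typing import Iterable, Iterator
from urllib.request import urlopen


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 chunks incrementally.

    An incremental decoder keeps partial bytes between calls, so a character
    split across two network chunks is not corrupted.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # flush anything still buffered
    if tail:
        yield tail


def read_stream(url: str, chunk_size: int = 1024) -> Iterator[str]:
    """Yield decoded pieces of a streamed HTTP response as they arrive."""
    with urlopen(url) as response:
        # read1 does at most one underlying read, so data surfaces as it
        # arrives instead of waiting for chunk_size bytes to accumulate.
        raw = iter(lambda: response.read1(chunk_size), b"")
        yield from decode_chunks(raw)
```

With one of the apps above running under uvicorn (default address http://127.0.0.1:8000), a loop such as `for piece in read_stream(url): print(piece, end="", flush=True)` would print the output as it is generated. The endpoint path in `url` depends on the app and is not specified here.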
For any questions or support, please create an issue or email jaswanth04@gmail.com.