Configurable Inference / Embedding Model & rate limit
aymenfurter opened this issue · 1 comment
Introduce environment variables for model selection and timeouts. This change will allow users to choose different LLM models, including free options, enhancing flexibility.
how about something like:
```
OPENAI_EMBEDDING=text-embedding-ada-002
OPENAI_MODEL=gpt-4-1106-preview
```
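A minimal sketch of reading these variables in Python, with fallback defaults so the app still runs when they are unset (the variable names follow the suggestion above; the defaults shown are assumptions, not project decisions):

```python
import os

# Fall back to hypothetical defaults when the variables are not set.
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-3.5-turbo")
OPENAI_EMBEDDING = os.getenv("OPENAI_EMBEDDING", "text-embedding-ada-002")
```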
I am currently putting these environment variables in a file named .env in the root directory (next to main.py), then using the dotenv module to load them. When I removed the .env file, dotenv appeared to fail silently, so if someone does not have a .env file (or chooses not to use one and sets environment variables some other way), things still seem to work.
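A sketch of that loading pattern. `load_dotenv()` from the python-dotenv package returns `False` rather than raising when no .env file is found, which is why the silent fallback works; the try/except here is an extra guard (my addition, not from the project) so the snippet also runs where python-dotenv is not installed:

```python
import os

try:
    from dotenv import load_dotenv
    # Returns False, without raising, when no .env file exists,
    # so shell-set environment variables still take effect.
    load_dotenv()
except ImportError:
    pass  # python-dotenv not installed; rely on the shell environment

# Hypothetical default model name, mirroring the free-tier suggestion below.
model = os.getenv("OPENAI_MODEL", "gpt-3.5-turbo")
```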
I decided to purchase some tokens on OpenAI so that I can use the more advanced models (like gpt-4). People who do not want to purchase OpenAI tokens can try it out for free by setting the OPENAI_MODEL variable to gpt-3.5-turbo; that is what I used to test some things before purchasing.