ai-ollama-py is a Flask server built with Python 3.10 that processes JSON payloads containing `context` and `prompt` strings. The server interacts with Ollama on port 11434 and exposes three endpoints:
- `/api/admin_engine`
- `/api/arc_engine`
- `/api/api_engine`
The server expects a JSON payload in the following format:

```json
{
  "context": "your context here",
  "prompt": "your prompt here"
}
```
- Python 3.10: Ensure you have Python 3.10 installed. You can download it from the official Python website.
- Ollama: Ollama is a dependency for this project. It's recommended to run Ollama on a machine with a GPU for better performance, but there is also a Dockerized option.
- Download Ollama:

  ```sh
  curl -fsSL https://ollama.com/install.sh | sh
  ```
- Run Ollama: After installation, start the Ollama server by running the following command in your terminal:

  ```sh
  ollama serve
  ```
- Pull Llama3:

  ```sh
  ollama pull llama3
  ```
*Pulling llama3 for the first time may take a few minutes.
**Note that Ollama runs best on a machine with a dedicated GPU; you can run it on a CPU-only machine, but expect slower response times.
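Before starting the Flask server, you can smoke-test Ollama's generate API directly from Python. This is a standalone check, not part of the project code; it assumes Ollama is serving locally on its default port with llama3 pulled as above.

```python
import requests

# Quick smoke test against Ollama's generate API.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Say hello in one word.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # the generated text
```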
- Use Python 3.10.
- Install the dependencies and start the server:

  ```sh
  python -m pip install -r requirements.txt
  python main.py
  ```
Query example:

```sh
curl -X POST -H "Content-Type: application/json" \
  -d '{"context": "block ui", "prompt": "How can I create a volume using the UI"}' \
  http://0.0.0.0:8087/api/admin_engine
```
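The same request from Python, equivalent to the curl call above (the host and port match the server started by main.py):

```python
import requests

# Equivalent of the curl example; the server listens on 0.0.0.0:8087.
resp = requests.post(
    "http://localhost:8087/api/admin_engine",
    json={"context": "block ui", "prompt": "How can I create a volume using the UI"},
    timeout=300,
)
print(resp.json())
```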
To ingest a local directory, set `INGEST_DIR` to its path before starting the server (example path shown):

```sh
export INGEST_DIR=/home/danduh/dev/mist-portal-nrwl/apps/portal-e2e/
python3 main.py
```
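Judging by its name, `INGEST_DIR` points the server at a directory of files to ingest. A sketch of how the variable might be read (the actual ingestion logic in main.py may differ):

```python
import os
from pathlib import Path

# Hypothetical sketch: walk the directory named by INGEST_DIR.
ingest_dir = Path(os.environ.get("INGEST_DIR", "."))
for path in ingest_dir.rglob("*"):
    if path.is_file():
        print(f"would ingest: {path}")  # placeholder for the real indexing step
```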
----
To expose Ollama to other machines, bind it to all interfaces:

```sh
OLLAMA_HOST=0.0.0.0 ollama serve
```
Example chat-completion request payload for Page Object generation (note the unfilled template placeholders in the user message):

```json
{
  "model": "gpt-4-turbo",
  "temperature": 0.1,
  "messages": [
    {
      "role": "system",
      "content": ""
    },
    {
      "role": "user",
      "content": "Create Page Object class with name . \n INPUT>>: <<INPUT \n undefined"
    }
  ]
}
```
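Building that payload programmatically might look like the sketch below. `build_chat_payload`, `page_name`, and `snippet` are hypothetical names; they stand in for the empty class name and the `undefined` slot visible in the template above.

```python
# Hypothetical payload builder; field names mirror the JSON example above.
def build_chat_payload(page_name: str, snippet: str, system_prompt: str = "") -> dict:
    return {
        "model": "gpt-4-turbo",
        "temperature": 0.1,
        "messages": [
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": f"Create Page Object class with name {page_name}. \n INPUT>>: <<INPUT \n {snippet}",
            },
        ],
    }

payload = build_chat_payload("LoginPage", "<html>...</html>")
```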