/llama-flask

Flask API for Effortless Text Generation

LlamašŸ¦™GenāœØ

A Flask-based API for generating text with the Llama C++ library (llama.cpp). It provides a simple interface to Llama's pre-trained language models: send an input prompt and receive the generated output as JSON.

Features

  • Built with Flask, a lightweight and easy-to-use web framework.
  • Uses the Llama C++ library for efficient text generation.
  • Handles JSON requests and responses for easy integration with other services.
  • Configurable model path and default number of tokens.
  • Includes error handling, logging, and environment variable support for model paths.
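The shape of an app with these features can be sketched roughly as follows. This is a minimal sketch, not the project's actual main.py: generate_text is a hypothetical placeholder for the call into the Llama C++ bindings, and the default model path mirrors the Usage notes below.

```python
import os
from flask import Flask, jsonify, request

app = Flask(__name__)


def generate_text(prompt, n_tokens=128):
    """Hypothetical stand-in for the llama.cpp-backed generator.

    The real project loads the model from LLAMA_MODEL_PATH (or the
    bundled default) and runs inference; here we only echo the prompt.
    """
    model_path = os.environ.get("LLAMA_MODEL_PATH", "models/ggml-alpaca-7b-q4.bin")
    # ... load the model from model_path and generate up to n_tokens ...
    return f"(generated continuation of: {prompt!r})"


@app.route("/generate", methods=["POST"])
def generate():
    # Reject requests that are not JSON or lack the required 'input' key.
    data = request.get_json(silent=True)
    if not data or "input" not in data:
        return jsonify({"error": "JSON body with an 'input' key is required"}), 400
    return jsonify({"output": generate_text(data["input"])})

# Running main.py would then start the server with
# app.run(host="127.0.0.1", port=5000).
```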

Usage

  1. Clone this repository and navigate to the project directory.
  2. Install requirements: pip install -r requirements.txt
  3. Optionally set the LLAMA_MODEL_PATH environment variable to the path of your Llama model. Otherwise, download ggml-alpaca-7b-q4.bin and place it in the models folder; it is used by default.
  4. Run python main.py. The API starts at http://127.0.0.1:5000/.
  5. To call the API, send a POST request to the /generate endpoint with a JSON body containing an 'input' key holding your prompt.

Example JSON request data:

{
    "input": "What is the capital of France?"
}

Example cURL command:

curl -X POST \
  http://127.0.0.1:5000/generate \
  -H 'Content-Type: application/json' \
  -d '{"input": "What is the capital of France?"}'

The API will process the input and return the generated output in JSON format.