Skynet is an API server for AI services wrapping several apps and models.
It is comprised of specialized modules which can be enabled or disabled as needed.
- Summary and Action Items with llama.cpp (enabled by default)
- Live Transcriptions with Faster Whisper via websockets
- 🚧 More to follow
- Poetry
- Redis
# Init and update submodules if you haven't already. This will add llama.cpp which provides the OpenAI api server
git submodule update --init
# Download the preferred GGUF llama model
mkdir "$HOME/models"
wget -q --show-progress "https://huggingface.co/jitsi/Llama-3-8B-Instruct-GGUF/resolve/main/llama-3-8b-instruct-Q4_K_M.gguf?download=true" -O "$HOME/models/llama-3-8b-instruct.Q4_K_M.gguf"
export LLAMA_PATH="$HOME/models/llama-3-8b-instruct.Q4_K_M.gguf"
export OPENAI_API_SERVER_PATH="$HOME/skynet/llama.cpp/server"
# start Redis
docker run -d --rm -p 6379:6379 redis
# disable authorization (for testing)
export BYPASS_AUTHORIZATION=1
poetry install
./run.sh
# open http://localhost:8000/summaries/docs in a browser
mkdir -p "$HOME/models/streaming-whisper"
export WHISPER_MODEL_NAME="tiny.en"
export BYPASS_AUTHORIZATION="true"
export ENABLED_MODULES="streaming_whisper"
export WHISPER_MODEL_PATH="$HOME/models/streaming-whisper"
poetry install
./run.sh
docker compose -f compose-dev.yaml up --build
docker cp $HOME/models/llama-3-8b-instruct-Q8_0.gguf skynet-web-1:/models
docker restart skynet-web-1
# localhost:8000 for Skynet APIs
# localhost:8001/metrics for Prometheus metrics
# localhost:8003 for llama.cpp web server GUI
Go to Streaming Whisper Demo to test your deployment from a browser
OR
Go to demos/streaming-whisper/ and start a Python http server.
python3 -m http.server 8080
Open http://127.0.0.1:8080.
Detailed documentation on how to configure, run, build and monitor Skynet and can be found in the docs.
If you want to contribute, make sure to install the pre-commit hook for linting.
poetry run githooks setup
Skynet is distributed under the Apache 2.0 License.