Since it's compatible with OpenAI Assistants API, this is how you would integrate the client side:
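A minimal sketch of that integration, assuming the API is served at http://localhost:3000 and that a placeholder API key is accepted (both are assumptions, so check your deployment). The helper names halClientOptions and startRun are hypothetical. The sketch relies on the official openai Node SDK and stops once a run has been started; polling the run and reading the reply are left out.

```javascript
// Sketch only: drives an OpenAI-compatible Assistants API through the
// official `openai` Node SDK (npm i openai).
// Assumptions: the base URL/port and the placeholder API key below are
// illustrative; adjust both to your deployment.
const HAL_BASE_URL = "http://localhost:3000"; // assumed address
const MODEL = "mistralai/Mixtral-8x7B-Instruct-v0.1"; // one of the recommended models

// Options for `new OpenAI(...)`: only the base URL differs from a stock setup.
function halClientOptions(baseURL = HAL_BASE_URL) {
  return { baseURL, apiKey: "EMPTY" }; // placeholder key (assumption)
}

// Creates an assistant, a thread with one user message, and starts a run.
// `client` is an OpenAI SDK instance (or any object exposing the same methods).
async function startRun(client, question) {
  const assistant = await client.beta.assistants.create({
    name: "Quickstart assistant",
    instructions: "You are a helpful assistant.",
    model: MODEL,
  });
  const thread = await client.beta.threads.create();
  await client.beta.threads.messages.create(thread.id, {
    role: "user",
    content: question,
  });
  const run = await client.beta.threads.runs.create(thread.id, {
    assistant_id: assistant.id,
  });
  return { assistantId: assistant.id, threadId: thread.id, runId: run.id };
}
```

With the SDK installed, pass `new OpenAI(halClientOptions())` as `client`. Nothing else in the client code needs to know it is not talking to OpenAI.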
News
- [2024/01/19] 🔥 Added usage with ollama. Keep reading 👇.
- [2024/01/19] 🔥 Action tool. Let your Assistant make requests to APIs.
- [2023/12/19] 🔥 New example: Open source LLM with code interpreter. Learn more.
- [2023/12/08] 🔥 New example: Open source LLM with function calling. Learn more.
- [2023/11/29] 🔥 New example: Using mistral-7b, an open source LLM. Check it out.
Key features
- Code Interpreter: Generates and runs Python code in a sandboxed environment autonomously. (beta)
- Knowledge Retrieval: Retrieves external knowledge or documents autonomously.
- Function Calling: Defines and executes custom functions autonomously.
- Actions: Execute requests to external APIs autonomously.
- Files: Supports a range of file formats.
- Enterprise production-ready:
  - observability (metrics, errors, traces, logs, etc.)
  - scalability (serverless, caching, autoscaling, etc.)
  - security (encryption, authentication, authorization, etc.)
Who is it for?
- You operate on a large scale and want to reduce your costs
- You want to increase your speed
- You want to increase customization (e.g. use your own models, extend the API, etc.)
- You work in a data-sensitive environment (healthcare, IoT, military, law, etc.)
- Your product has poor or no internet access (military, IoT, extreme environment, etc.)
HAL-9100 is in continuous development, with the aim of always offering better infrastructure for Edge LLMs. To achieve this, it is based on several principles that define its functionality and scope.
Less prompt is more
As few prompts as possible should be hard-coded into the infrastructure: just enough to bridge the gap between Software 1.0 and Software 3.0, while giving the client as much control as possible over the prompts.
Edge-first
By focusing on open source LLMs, HAL-9100 does not require internet access, which means you own your data and your models. It runs on a Raspberry Pi (LLM included).
OpenAI-compatible
OpenAI put a great deal of top engineering talent into designing this API, which makes it an incredible experience for developers. Support for OpenAI LLMs is not a priority at all, though.
Reliable and deterministic
HAL-9100 focuses on reliability and on being as deterministic as possible by default. That's why everything has to be tested and benchmarked.
Flexible
HAL-9100 keeps hard-coded prompts and behaviors to a minimum, supports a wide range of models, infrastructure components, and deployment options, and plays well with the open-source ecosystem, while only integrating projects that have stood the test of time.
Get started in less than a minute through GitHub Codespaces:
Or:
git clone https://github.com/stellar-amenities/hal-9100
cd hal-9100
To get started quickly, let's use the Anyscale API.
Get an API key from Anyscale, then replace the model_api_key value in hal-9100.toml with your API key.
Usage with ollama
- use model_url = "http://ollama:11434/v1/chat/completions"
- set gemma:2b as the model in examples/quickstart.js
- and run:
docker compose --profile api --profile ollama -f docker/docker-compose.yml up
Install OpenAI SDK: npm i openai
Start the infra:
docker compose --profile api -f docker/docker-compose.yml up
Run the quickstart:
node examples/quickstart.js
Is there a hosted version?
No. HAL-9100 is not a hosted service. It is software that you deploy on your own infrastructure. We can help you with the deployment. Contact us.
Which LLM API can I use?
Examples of LLM APIs that expose an OpenAI-compatible interface and that you can use:
- ollama
- MLC-LLM
- FastChat (good if you have a Mac)
- vLLM (good if you have a modern GPU)
- Perplexity API
- Mistral API
- Anyscale
- Together AI
We recommend these models:
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/mistral-7b
Other models have not been extensively tested and may not work as expected, but you can try them.
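One way to check whether an untested API or model is worth trying is to send it a single request in the OpenAI chat completions format and inspect the shape of the reply. The sketch below does that with the fetch built into Node 18+. The helper names buildChatRequest and probeEndpoint are hypothetical, and the check covers only the basic response shape, not function calling or the other features listed above.

```javascript
// Sketch only: sends one request in the OpenAI chat completions format and
// checks that the reply has the expected shape.
// Assumption: Node 18+ (global fetch). URL, model, and key are supplied by you.
function buildChatRequest(model, userMessage, apiKey) {
  const headers = { "Content-Type": "application/json" };
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`; // hosted APIs usually need a key
  return {
    method: "POST",
    headers,
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}

// Returns the first reply text, or throws if the endpoint does not answer
// in the chat completions shape.
async function probeEndpoint(modelUrl, model, apiKey) {
  const response = await fetch(modelUrl, buildChatRequest(model, "Say hello.", apiKey));
  if (!response.ok) throw new Error(`HTTP ${response.status} from ${modelUrl}`);
  const data = await response.json();
  const text = data?.choices?.[0]?.message?.content;
  if (typeof text !== "string") {
    throw new Error("Response is not in chat completions format");
  }
  return text;
}
```

For a local ollama instance this could be called as `probeEndpoint("http://localhost:11434/v1/chat/completions", "gemma:2b")`, assuming the port is published on localhost. Inside the compose network the host is ollama, as in the model_url shown earlier.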