mostlygeek/llama-swap
On-demand model switching with llama.cpp (or other OpenAI-compatible backends)
Go · MIT license
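The proxy exposes a single OpenAI-compatible API and starts or swaps the underlying llama.cpp server based on the model named in each request. Below is a minimal client sketch, not the project's documented usage: the listen address (localhost:8080) and the model name ("llama-8b") are assumptions standing in for whatever the proxy's config actually defines.

```go
// Minimal sketch of a client call through llama-swap's OpenAI-compatible
// proxy. The address and model name are illustrative assumptions; the
// proxy decides which backend to launch from the request's "model" field.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model": "llama-8b", // hypothetical model name from the proxy's config
		"messages": []map[string]string{
			{"role": "user", "content": "Hello!"},
		},
	})

	// Standard OpenAI-style chat completions route; port 8080 is an
	// assumption for this sketch, not a documented default.
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```

On the first request for a given model the proxy can load the matching backend on demand, so the initial call may take longer than subsequent ones.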
Issues
- problems unloading the model (#28, 2 comments)
- Add Homebrew Support for macOS (#26, 2 comments)
- [Feature] Queue requests (#19, 0 comments)
- List in llama.cpp readme (#18, 3 comments)
- Support more API end points. (#12, 5 comments)
- Proxy does not set content length. (#11, 5 comments)
- Support routing to multiple backends (#7, 7 comments)
- Container execution (#5, 0 comments)
- Support the v1/embedding endpoint (#4, 0 comments)