hyperonym/basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

Python · MIT

Pinned issues

Add support for chat completion API

#140 opened 2 years ago by peakji

Open · 1 comment

Support for `v1/embeddings` endpoint

#179 opened 2 years ago by josephrocca

Closed · 12 comments

Issues

401 error on llama2 model while access granted
#289 opened a year ago by tomtomtomtom44
0
Runpod Serverless
#288 opened a year ago by stonejohnson
1
Loading basaran on multiple gpus leads to error
#280 opened a year ago by tanmaylaud
0
Can you create a Gradio app for this Trained model
#276 opened a year ago by yhammadmufin
0
Suggestion: Passthrough to OpenAI GPT3.5 for testing
#273 opened a year ago by FetchFast
0
Add support for chat completion API
#140 opened 2 years ago by peakji
1
Tried multiple different models but get "The model weights are not tied..." error every time..
#266 opened a year ago by jontstaz
0
How to send Audio Inputs to the Basaran
#234 opened a year ago by Tushar-ml
3
Strong need for multiple `models` in a single deployment
#263 opened a year ago by KastanDay
3
Llama 2 models not working - how to pass auth token?
#232 opened a year ago by arsaboo
1
TypeError: issubclass() arg 1 must be a class
#253 opened a year ago by gsuuon
1
:latest version tag
#138 opened 2 years ago by mariushosting
2
Use basaran API as Langchain LLM
#256 opened a year ago by brightebyte
0
RuntimeError: expected scalar type Float but found Half
#215 opened 2 years ago by DataDropp
0
TypeError: __init__() got an unexpected keyword argument 'load_in_4bit'
#227 opened a year ago by tanshuai
1
Error when Running Vicuna's FastChat Model without GPU
#223 opened a year ago by davyeu
1
FR support for using fine tuned models that use Peft
#221 opened a year ago by samos123
0
Langchain Prompt Format
#198 opened 2 years ago by 0xDigest
5
I want use the function prefix_allowed_tokens_fn, where of basaran's source code shall I modify?
#220 opened a year ago by zoubaihan
1
Falcon 40B : too slow and random answers
#204 opened 2 years ago by ArnaudHureaux
7
GPTQ & 4bit
#180 opened 2 years ago by olihough86
3
crash when running mosaicml/mpt-7b-* models: KeyError: 'attention_mask'
#213 opened 2 years ago by tarasglek
2
QLoRa support
#202 opened 2 years ago by bitnom
4
concurrent request supported?
#205 opened 2 years ago by hudengjunai
1
Support for `v1/embeddings` endpoint
#179 opened 2 years ago by josephrocca
12
in stream mode, the English word has no space after detokenizer and Chinese were messed up
#197 opened 2 years ago by lucasjinreal
14
Vicuna problem
#160 opened 2 years ago by zhound420
13
Docker run runs and then exits, does not set up server
#183 opened 2 years ago by handrew
2
How to set my own parameters in model.generate() in basaran?
#193 opened 2 years ago by zoubaihan
2
Inference should stop if connection is aborted/closed
#192 opened 2 years ago by josephrocca
2
Define chat history format using jinja template
#141 opened 2 years ago by peakji
2
Do you have Discord community?
#185 opened 2 years ago by karfly
1
`v1/completions` does not include `data: ` prefix when `stream:true`
#144 opened 2 years ago by josephrocca
5
How can I use this with a project using OpenAI's nodejs library?
#120 opened 2 years ago by MarkSchmidty
2
ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.
#139 opened 2 years ago by josephrocca
1
How to pass `max_token_length` to `load_model` ?
#116 opened 2 years ago by MohamedAliRashad
1
Support ARM Docker images
#110 opened 2 years ago by WillBeebe
1
how to run model in total offline?
#109 opened 2 years ago by gitknu
4
Support for llama.cpp/ggml models
#107 opened 2 years ago by codito
2
RuntimeError: mat1 and mat2 shapes cannot be multiplied
#181 opened 2 years ago by lcw99
1
Getting error for model when using vicuna model
#152 opened 2 years ago by djaffer
5
Possible to run on M-series chips/MPS?
#173 opened 2 years ago by fakerybakery
3
Question about COMPLETION_MAX_PROMPT
#158 opened 2 years ago by nicpopovic
5
Any chance to support text-generation-inference as backend?
#147 opened 2 years ago by hewr1993
1
The requested URL was not found on the server
#151 opened 2 years ago by artivis
2
Instructions unclear
#137 opened 2 years ago by Anonym0us33
1
CORS headers
#143 opened 2 years ago by josephrocca
7
Replace EventSource with POST requests in playground
#145 opened 2 years ago by fardeon
0
Add a chat interface to playground
#142 opened 2 years ago by peakji
0
Slow Streaming
#99 opened 2 years ago by manojpreveen
2