xusenlinzy/api-for-open-llm
OpenAI-style API for open large language models: use open LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend API for open-source large language models.
Python · Apache-2.0
Issues
Deploying glm4-9b on 4×4090 GPUs: API calls from Dify return errors
#315 opened by he498 - 3
In TASKS=llm,rag mode, a threading error occurs: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
#308 opened by syusama - 2
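The RuntimeError above is raised when CUDA is initialized in the parent process and a worker is then forked. A minimal sketch of the workaround the message itself suggests, forcing Python's 'spawn' start method (generic Python, not code from this repository):

```python
import multiprocessing as mp

# 'fork' (the Linux default) copies the parent's already-initialized
# CUDA context into the child, which CUDA forbids; 'spawn' starts a
# fresh interpreter instead. Call this before any CUDA work happens.
mp.set_start_method("spawn", force=True)

print(mp.get_start_method())  # → spawn
```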
Output is truncated under this framework's vllm backend, but not when launching with official vLLM or running the model with transformers
#314 opened by TLL1213 - 0
Running streamlit_app.py raises an error
#310 opened by louan1998 - 3
Problem using the Qwen2-7B-Instruct model with vLLM
#303 opened by Empress7211 - 1
glm4 connected to Dify cannot trigger tool use
#288 opened by he498 - 1
Running glm4v: requests return errors
#311 opened by 760485464 - 0
not support sglang backend
#309 opened by colinsongf - 3
Error running under Docker: multiproc_worker_utils.py:226] RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
#305 opened by syusama - 0
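For the same CUDA re-initialization error inside vLLM's own multiprocessing workers (note the multiproc_worker_utils.py path in the title), vLLM exposes an environment variable to switch its worker start method; a sketch, assuming it is set in the container environment before the server starts:

```shell
# Ask vLLM to start its multiprocessing workers with 'spawn'
# instead of 'fork'; must be set before launching the server.
export VLLM_WORKER_MULTIPROC_METHOD=spawn
```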
Deploying gte-qwen2-1.5b-instruct: requests to the rerank endpoint fail
#307 opened by cowcomic - 10
glm-4v starts normally but inference requests fail
#291 opened by 760485464 - 9
python: can't open file '/workspace/api/server.py': [Errno 2] No such file or directory, when deploying Qwen2-72B-Instruct-GPTQ-Int4 with docker-compose on Ubuntu
#304 opened by syusama - 1
llama3-8B keeps talking to itself after answering and never stops
#296 opened by yd9038074 - 2
minicpm starts fine, but inference requests fail
#292 opened by 760485464 - 0
[embedding] Are the latest SOTA models unsupported? KeyError: 'Could not automatically map text2vec-base-multilingual to a tokeniser.
#297 opened by ForgetThatNight - 0
Doc chat raises FileNotFoundError: Table does not exist. Please first call db.create_table(, data)
#299 opened by Weiqiang-Li - 5
qwen2 inference error
#293 opened by wj1017090777 - 5
Running Qwen2-7B with api-for-open-llm & vllm on multiple GPUs fails with GPU memory full
#290 opened by Woiea - 2
SQL chat raises a ProgrammingError
#277 opened by songyao199681 - 2
Using streamer_v2 produces garbled output
#287 opened by Tendo33 - 1
"POST /v1/files HTTP/1.1" 404 Not Found
#286 opened by KEAI404 - 2
The model I want to use is not in the supported-model list; does that mean this project cannot expose an OpenAI-style API for it?
#258 opened by xiaoma444 - 4
Inference with qwen2-72B-AWQ using the latest vllm image fails
#285 opened by Tendo33 - 1
Docker cannot pull the image
#284 opened by xqinshan - 2
API request error: TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
#282 opened by syusama - 12
Deploying vllm with Docker returns 404 Not Found
#271 opened by skyliwq - 14
Inference error in vllm mode
#279 opened by yeehua-cn - 0
Cannot run instruction.py
#280 opened by NCCurry30 - 5
EMBEDDING_API_BASE is not picked up: str expected, not NoneType
#270 opened by chukangkang - 2
baichuan2-13b-chat gives garbled answers and cannot produce code
#261 opened by guiniao - 1
I have deployed many models; is there a web UI that lets me call all of them for inference from one place?
#278 opened by Tendo33 - 4
Local vllm deployment: the vllm engine fails to start
#274 opened by Ruibn - 0
When will the Qwen 1.5 function-call feature be fixed?
#273 opened by skyliwq - 2
Icon not found
#265 opened by lucheng07082221 - 0
💡 vLLM now supports pipeline parallelism (pipeline parallel), which can greatly increase throughput; could the author add vLLM pipeline-parallel support?
#269 opened by CaptainLeezz - 0
lifespan not work, cache not cleared
#260 opened by Yimi81 - 1
Dependency error in the vllm container
#268 opened by Tendo33 - 2
llama3 answers never stop after a question
#266 opened by gptcod - 3
Running internlm2 fails with missing weight files that the model does not provide
#263 opened by 760485464 - 1
Bug concerning SETTINGS = Settings() in api/config.py
#262 opened by Tendo33 - 3
functions have no effect when MODEL_NAME=qwen2
#257 opened by liuyi1213812 - 1
ValueError: The model's max seq len (32768) is larger than the maximum number of tokens that can be stored in KV cache (15248). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine.
#259 opened by guiniao
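The ValueError in the last issue names its own two knobs. A sketch of a vLLM OpenAI-server launch adjusting them (the model name and the specific values are illustrative, not taken from the issue):

```shell
# Either give vLLM a larger share of GPU memory for the KV cache,
# or cap the context length so the KV cache fits; both flags are
# standard vLLM engine arguments.
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2-7B-Instruct \
  --gpu-memory-utilization 0.95 \
  --max-model-len 8192
```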