xusenlinzy/api-for-open-llm
OpenAI-style API for open large language models: use open LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend interface for open-source large models.
Python · Apache-2.0
Issues
Using streamer_v2 produces garbled output
#287 opened by Tendo33 - 0
After connecting glm4 to Dify, tool use cannot be triggered
#288 opened by he498 - 1
ProgrammingError raised when running SQL chat
#277 opened by songyao199681 - 1
"POST /v1/files HTTP/1.1" 404 Not Found
#286 opened by KEAI404 - 6
Qwen1.5 inference error: RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
#241 opened by syusama - 2
The model I want to use is not in the supported-model list; does that mean this project cannot serve it behind an OpenAI-style API?
#258 opened by xiaoma444 - 4
Error running qwen2-72B-AWQ inference with the latest vllm image
#285 opened by Tendo33 - 1
Docker cannot pull the image
#284 opened by xqinshan - 2
💡 [REQUEST] - <Could you support xAI's newly released Grok-1 model?>
#248 opened by Hapluckyy - 0
API request error: TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
#282 opened by syusama - 0
InternLM 20B outputs gibberish; what could be the cause?
#244 opened by jaffe-fly - 12
404 Not Found when deploying vllm with Docker
#271 opened by skyliwq - 14
Inference error in vllm mode
#279 opened by yeehua-cn - 0
Cannot run instruction.py
#280 opened by NCCurry30 - 5
EMBEDDING_API_BASE cannot be read: str expected, not NoneType
#270 opened by chukangkang - 2
With the baichuan2-13b-chat model, answers are garbled and it cannot write code
#261 opened by guiniao - 1
I have deployed many models; is there a web UI that lets me call all the deployed models for inference in one place?
#278 opened by Tendo33 - 4
vllm engine fails to start during local vllm deployment
#274 opened by Ruibn - 0
When will the Qwen 1.5 function-calling feature be fixed?
#273 opened by skyliwq - 2
Icon not found
#265 opened by lucheng07082221 - 4
Qwen1.5 does not support tool_choice
#245 opened by YunmengLiu0 - 10
With vllm inference, text is missing at the end of the output
#236 opened by parasol-ry - 0
💡 vllm now supports pipeline parallelism, which can greatly increase throughput; could the author add vllm pipeline-parallel support?
#269 opened by CaptainLeezz - 0
lifespan does not run, cache not cleared
#260 opened by Yimi81 - 0
💡 [REQUEST] - Support CodeLlama-70b-Instruct-hf
#255 opened by Reset816 - 1
Dependency error in the vllm container
#268 opened by Tendo33 - 2
llama3 responses do not stop after a question
#266 opened by gptcod - 3
Running internlm2 errors with weight files not found; the model does not provide these files
#263 opened by 760485464 - 1
34B model with int4 and vllm inference on a single 24G 4090 GPU reports insufficient GPU memory
#239 opened by haohuisss - 2
Qwen1.5-7B-Chat completions API call fails to generate a continuation
#235 opened by kanslor - 1
Bug in `SETTINGS = Settings()` in api/config.py
#262 opened by Tendo33 - 3
functions have no effect when MODEL_NAME=qwen2
#257 opened by liuyi1213812 - 1
ValueError: The model's max seq len (32768) is larger than the maximum number of tokens that can be stored in KV cache (15248). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine.
#259 opened by guiniao - 0
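The check behind this error can be sketched as simple arithmetic (a hypothetical simplification; vLLM's actual validation lives inside its engine configuration):

```python
def fits_kv_cache(max_model_len: int, kv_cache_tokens: int) -> bool:
    """Mirror of vLLM's sanity check: the engine refuses to start when the
    model's maximum sequence length exceeds the number of tokens the
    allocated KV cache can hold."""
    return max_model_len <= kv_cache_tokens

# Numbers from the error message above:
print(fits_kv_cache(32768, 15248))  # False: the engine raises ValueError

# Decreasing max_model_len (or raising gpu_memory_utilization so the
# cache holds more tokens) makes the check pass:
print(fits_kv_cache(15000, 15248))  # True
```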
TypeError: 'NoneType' object is not subscriptable
#243 opened by deauss2017 - 0
Hello, is the mistral model supported?
#246 opened by lucheng07082221 - 1
INFO: 172.20.0.8:60822 - "POST /v1/chat/completions HTTP/1.1" 422 Unprocessable Entity
#256 opened by besthong999 - 2
Error when calling bge via LangChain
#252 opened by Qoooooooooooo - 2
[Bug] Pydantic version conflict in the vLLM image
#251 opened by liuyanyi - 1
ModuleNotFoundError: No module named 'api' when running api/server.py (line 1: `from api.config import SETTINGS`)
#254 opened by 779257747 - 1
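The import failure above is a module-search-path issue: `api` is a package under the repository root, so `from api.config import SETTINGS` only resolves when that root is on `sys.path`. A minimal sketch (the checkout path is hypothetical):

```python
import sys

repo_root = "/path/to/api-for-open-llm"  # hypothetical checkout location

# Running `python api/server.py` puts the script's own directory (api/) on
# sys.path, not the repo root, so the top-level `api` package is invisible.
# Prepending the root fixes the lookup; executing `python -m api.server`
# from the repo root achieves the same thing implicitly.
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

print(repo_root in sys.path)  # True
```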
vllm-mode startup error: ImportError: cannot import name 'model_validator' from 'pydantic' (/usr/local/lib/python3.10/dist-packages/pydantic/__init__.cpython-310-x86_64-linux-gnu.so)
#250 opened by syusama - 2
Docker build error: ERROR: failed to solve: process "/bin/sh -c pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple && pip install bitsandbytes --upgrade && pip install vllm==0.3.3 && pip install --no-cache-dir -r /workspace/requirements.txt && pip uninstall transformer-engine -y" did not complete successfully: exit code: 1
#249 opened by syusama - 1
Qwen1.5 prompt template lacks a default system message
#240 opened by liuyanyi