xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
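The "single line of code" swap the description mentions can be sketched with the standard library alone: the request body follows the OpenAI-compatible chat-completions format, and only the base URL changes from api.openai.com to a local Xinference endpoint. The port (9997) is Xinference's default; the model name below is a hypothetical deployed model, not something the listing guarantees.

```python
import json
import urllib.request

# The base URL is the only line that changes when swapping hosted
# OpenAI for a local Xinference deployment (default port 9997).
BASE_URL = "http://localhost:9997/v1"   # was "https://api.openai.com/v1"

# Standard OpenAI-style chat-completions payload; "qwen2.5-instruct"
# is a placeholder for whatever model you have launched.
payload = {
    "model": "qwen2.5-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request) would send it, given a running server.
```

With the official `openai` Python client the same idea is a single `base_url=` argument when constructing the client; the rest of the application code stays unchanged.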
Python · Apache-2.0
Issues
Error: Segmentation fault (core dumped)
#2682 opened by SKKKKYLAR - 11
Problem running the GOT-OCR2.0 model
#2679 opened by zjx140 - 5
Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
#2604 opened by Weishaoya - 15
Fake load balancing?
#2612 opened by ck7colin - 3
Output format of tool calls has changed
#2616 opened by alvinlee518 - 3
Could on-demand model launching and idle-model aging (unloading), similar to Ollama, be added?
#2627 opened by a908569749 - 2
Error: 'NoneType' object is not iterable
#2643 opened by hsoftxl - 4
Request: add a priority parameter so that high-priority requests are scheduled first
#2669 opened by 781574155 - 2
Model answers normally after each restart, but after a while conversations hang indefinitely
#2674 opened by jiusi9 - 1
The documentation for some system parameters is incomplete; please clarify
#2676 opened by nopipifish - 1
Cannot stop models via the frontend web UI
#2677 opened by nopipifish - 2
Xinference v1.0.1: guided_json cannot be used
#2636 opened by 780966854 - 9
Support Cline-style requests.
#2659 opened by gahoo - 4
Error when calling an Xinference LLM via LangChain
#2621 opened by winter-JX - 2
ValueError: Rerank model *** not found in Cluster mode
#2605 opened by congge27 - 2
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root
#2614 opened by yyxg - 1
Request: integrate InternVL2.5-78B
#2662 opened by watch-Ultra - 1
Suggestions for platform integration of stable-diffusion-3.5-large
#2664 opened by zhangjianquan - 1
qwen2.5-7B-instruct-AWQ inference error: __pydantic_core_schema__
#2666 opened by hqm19 - 2
Xinference with the vLLM engine does not support multi-GPU inference
#2651 opened by LRCgtp - 3
When the context is too long, a "model not found" error occurs; the suffix "-1-0" is automatically appended to the model name
#2653 opened by zmalqp189 - 6
Suggestion: add support for the multimodal embedding model jinaai/jina-clip-v2
#2628 opened by sevenold - 1
In cluster mode, launching a registered custom model reports "not found"
#2645 opened by DawnOf1996 - 1
Launch fails
#2633 opened by Readyou123 - 1
After GPU memory overflow, the LLM still shows as running but the API is unavailable
#2648 opened by YiLin198 - 1
Whisper inference API reports an error
#2650 opened by closer-finger - 10
GPU memory usage grows over time with the vLLM engine and never goes back down
#2639 opened by turndown - 2
When Dify is connected to CosyVoice-300M-SFT of Xinference for text-to-speech conversion, an error is reported.
#2606 opened by peterliang5678 - 2
How to set up xinference behind proxy-server ?
#2608 opened by wan2355 - 2
The embedding results for the same input, whether provided as a list or a string, should be identical.
#2610 opened by xiyuan-lee - 3
With the Transformers model engine, the max_tokens parameter has no effect
#2601 opened by alvinlee518 - 3
Support controlling whether special tokens are returned in chat completion requests
#2656 opened by zjuyzj - 4
Suggestion: add xgrammar structured-output support
#2620 opened by ZanePoe - 8
In v1.0.1, the qwen2_vl model fails to run, reporting: xinference.core.worker Leave launch_builtin_model, error: [address=0.0.0.0:42695, pid=88] cannot import name 'Qwen2VLForConditionalGeneration' from 'transformers' (/opt/conda/lib/python3.11/site-packages/transformers/__init__.py)
#2642 opened by zhudemiao - 3
Bug: QwQ-32B-preview pulling from a wrong hub
#2646 opened by redreamality - 3
No useful information; no error stack trace was attached.
#2630 opened by Joker-sad - 4
Qwen2___5-72B-Instruct-GPTQ-Int8 fails to launch with vLLM
#2634 opened by Weishaoya - 2
Qwen2.5-72b-instruct-awq
#2631 opened by lmolhw5252 - 1
Server error: 500 - [address=0.0.0.0:41823, pid=1782322] __init__() missing 2 required positional arguments: 'supervisor_address' and 'replica_model_uid'
#2622 opened by Joker-sad - 2
vLLM inference with a LoRA module on top of Qwen2.5-7B-Instruct raises an error.
#2609 opened by wingtch - 1
Permission error when integrating the SenseVoice model
#2618 opened by terryoyty - 19
FLUX.1-dev text-to-image runs with multiple replicas, but API calls still generate images sequentially in a queue
#2617 opened by turndown - 2
Calling the Xinference got-ocr2 model API via the openai client raises openai.InternalServerError: Error code: 500 - {'detail': "[address=0.0.0.0:43157, pid=78] 'GotOCR2Model' object has no attribute 'model_spec'"}
#2607 opened by pandaominggz