xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
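The "single line of code" swap the description mentions can be sketched with the standard library alone: the request body follows the OpenAI-compatible chat-completions format, and only the base URL changes from api.openai.com to a local Xinference endpoint. The port (9997) is Xinference's default; the model name below is a hypothetical deployed model, not something the listing guarantees.

```python
import json
import urllib.request

# The base URL is the only line that changes when swapping hosted
# OpenAI for a local Xinference deployment (default port 9997).
BASE_URL = "http://localhost:9997/v1"   # was "https://api.openai.com/v1"

# Standard OpenAI-style chat-completions payload; "qwen2.5-instruct"
# is a placeholder for whatever model you have launched.
payload = {
    "model": "qwen2.5-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request) would send it, given a running server.
```

With the official `openai` Python client the same idea is a single `base_url=` argument when constructing the client; the rest of the application code stays unchanged.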
Python · Apache-2.0
Issues
Error: Segmentation fault (core dumped)
#2682 opened by SKKKKYLAR - 11
Problem running the GOT-OCR2.0 model
#2679 opened by zjx140 - 5
Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
#2604 opened by Weishaoya - 15
Fake load balancing?
#2612 opened by ck7colin - 3
Output format of tool calls has changed
#2616 opened by alvinlee518 - 3
Could on-demand model launching and idle-model aging (unloading), similar to Ollama, be added?
#2627 opened by a908569749 - 2
Error: 'NoneType' object is not iterable
#2643 opened by hsoftxl - 4
Request: add a priority parameter so that high-priority requests are scheduled first
#2669 opened by 781574155 - 2
Model answers normally after each restart, but after a while conversations hang indefinitely
#2674 opened by jiusi9 - 1
The documentation for some system parameters is incomplete; please clarify
#2676 opened by nopipifish - 1
Cannot stop models via the frontend web UI
#2677 opened by nopipifish - 2
Xinference v1.0.1: guided_json cannot be used
#2636 opened by 780966854 - 9
Support Cline-style requests.
#2659 opened by gahoo - 4
Error when calling an Xinference LLM via LangChain
#2621 opened by winter-JX - 2
ValueError: Rerank model *** not found in Cluster mode
#2605 opened by congge27 - 2
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root
#2614 opened by yyxg - 1
Request: integrate InternVL2.5-78B
#2662 opened by watch-Ultra - 1
Suggestions for platform integration of stable-diffusion-3.5-large
#2664 opened by zhangjianquan - 1
qwen2.5-7B-instruct-AWQ inference error: __pydantic_core_schema__
#2666 opened by hqm19 - 2
Xinference with the vLLM engine does not support multi-GPU inference
#2651 opened by LRCgtp - 3
When the context is too long, a "model not found" error occurs; the suffix "-1-0" is automatically appended to the model name
#2653 opened by zmalqp189 - 6
Suggestion: add support for the multimodal embedding model jinaai/jina-clip-v2
#2628 opened by sevenold - 1
In cluster mode, launching a registered custom model reports "not found"
#2645 opened by DawnOf1996 - 1
Launch fails
#2633 opened by Readyou123 - 1
After GPU memory overflow, the LLM still shows as running but the API is unavailable
#2648 opened by YiLin198 - 1
Whisper inference API reports an error
#2650 opened by closer-finger - 10
GPU memory usage grows over time with the vLLM engine and never goes back down
#2639 opened by turndown - 2
When Dify is connected to CosyVoice-300M-SFT of Xinference for text-to-speech conversion, an error is reported.
#2606 opened by peterliang5678 - 2
How to set up xinference behind proxy-server ?
#2608 opened by wan2355 - 2
The embedding results for the same input, whether provided as a list or a string, should be identical.
#2610 opened by xiyuan-lee - 3
With the Transformers model engine, the max_tokens parameter has no effect
#2601 opened by alvinlee518 - 3
Support controlling whether special tokens are returned in chat completion requests
#2656 opened by zjuyzj - 4
Suggestion: add xgrammar structured-output support
#2620 opened by ZanePoe - 8
In v1.0.1, the qwen2_vl model fails to run, reporting: xinference.core.worker Leave launch_builtin_model, error: [address=0.0.0.0:42695, pid=88] cannot import name 'Qwen2VLForConditionalGeneration' from 'transformers' (/opt/conda/lib/python3.11/site-packages/transformers/__init__.py)
#2642 opened by zhudemiao - 3
Bug: QwQ-32B-preview pulling from a wrong hub
#2646 opened by redreamality - 3
No useful information; no error stack trace was attached.
#2630 opened by Joker-sad - 4
Qwen2___5-72B-Instruct-GPTQ-Int8 fails to launch with vLLM
#2634 opened by Weishaoya - 2
Qwen2.5-72b-instruct-awq
#2631 opened by lmolhw5252 - 1
Server error: 500 - [address=0.0.0.0:41823, pid=1782322] __init__() missing 2 required positional arguments: 'supervisor_address' and 'replica_model_uid'
#2622 opened by Joker-sad - 2
vLLM inference with a LoRA module on top of Qwen2.5-7B-Instruct raises an error.
#2609 opened by wingtch - 1
Permission error when integrating the SenseVoice model
#2618 opened by terryoyty - 19
FLUX.1-dev text-to-image runs with multiple replicas, but API calls still generate images sequentially in a queue
#2617 opened by turndown - 2
Calling the Xinference got-ocr2 model API via the openai client raises openai.InternalServerError: Error code: 500 - {'detail': "[address=0.0.0.0:43157, pid=78] 'GotOCR2Model' object has no attribute 'model_spec'"}
#2607 opened by pandaominggz