Issues
I see Qwen-7B in the supported model list; is Qwen1.5-14B also supported?
#422 opened by koalaaaaaaaaa - 1
[Question] Support for Mistral?
#415 opened by BaiMoHan - 2
[BUG] Slow-tokenizer message is printed even when the fast tokenizer may be in use
#407 opened by david-vectorflow - 1
Qwen-14B-INT8 hits the error: 'QwenTransformerLayerWeight' object has no attribute 'q_weight_'
#333 opened by wangr0031 - 1
Support Qwen1.5?
#389 opened by xxm1668 - 3
How to compute the perplexity of a specific output from the model
#401 opened by harvinyou - 0
Is the Llama 3 architecture supported?
#402 opened by harvinyou - 1
Are there plans to support MiniCPM-V-2?
#404 opened by xiabo0816 - 0
Are there plans to support MiniCPM-V-2?
#403 opened by xiabo0816 - 1
I want to get the raw logits. Which API should I use?
#398 opened by harvinyou - 5
Is AWQ 4-bit deployment of Yi-34B supported yet?
#291 opened by xyfZzz - 1
Are there plans to support other accelerators (non-NVIDIA)?
#399 opened by huangfude - 1
How to launch an inference model across multiple GPUs
#393 opened by harvinyou - 1
flash_llm_fp6_llm
#383 opened by wm901115nwpu - 1
`grid` in `context_attention_fwd_no_prompt_cache`
#381 opened by liyucheng09 - 2
[BUG] The inference result includes extra prompt content (Human:xxxxx, \n Assistant:xxxxx)
#372 opened by SleepyHollowforesthills - 1
[Ask] Comparison of PageAttention and TokenAttention
#379 opened by zzb610 - 0
[BUG] There already is a lightllm in pypi
#380 opened by rlippmann - 1
[BUG]Error in llama/triton_kernel/silu_and_mul.py/test_silu_and_mul function due to in-place modification of parameters and Triton kernel error in version 2.0.0
#357 opened by mivenis - 1
Weight-only INT4 is slower than CUTLASS INT4
#362 opened by zhoutianzi666 - 2
[BUG] failed to serve a Qwen1.5-72B-chat model
#350 opened by pluiez - 3
[BUG] Some issues with the benchmark
#329 opened by Storm0921 - 1
[BUG] Support for DeepSeek?
#325 opened by suhjohn - 5
InternLM2-20B is not supported
#327 opened by Storm0921 - 0
[BUG] stop_words
#326 opened by baisechundu - 1
int4_kernel
#324 opened by Cydia2018 - 10
[Feature] Please provide a load_from_weight_dict(weight_dict) interface.
#277 opened by bingo787 - 7
[BUG] Baichuan13B model init failed
#323 opened by bingo787 - 1
Shape of the Llama RoPE sin/cos tensors
#319 opened by feifeibear - 2
Can the sqlcoder model family be supported?
#310 opened by 2496289471 - 1
Can lightllm do offline inference? Is there any reference code?
#308 opened by monkeyZhy - 5
No module named petrel_client
#298 opened by Lvjinhong - 2
What is the plan to support beam search?
#286 opened by feifeibear - 4
How to use LlamaTpPartModel
#287 opened by feifeibear - 5
Is multi-node inference supported?
#271 opened by zbtrs - 2
Is there any comparison of the effects of token attention, e.g. against page attention?
#268 opened by skykiseki - 1
Custom template
#245 opened by bino282 - 1
Is ChatGLM3-6b supported yet?
#246 opened by Jeru2023 - 3
[BUG] Baichuan2-13B returns a single stop token when the input is around 1024 tokens long
#244 opened by HJT9328 - 3
Aligning the temperature parameter with vLLM
#241 opened by ArachisTong