Issues
[Feature Request] OpenAI-compatible `stop` param
#1731 opened by josephrocca - 5
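For context on the request above, a minimal sketch of what an OpenAI-style `stop` parameter looks like when sent to lmdeploy's `api_server` (the port, model name, and stop strings are illustrative placeholders):

```python
# Hedged sketch: an OpenAI-compatible request carrying `stop`.
# Whether lmdeploy honors the field is exactly what issue #1731 requests.
import requests

resp = requests.post(
    "http://localhost:23333/v1/chat/completions",  # lmdeploy api_server default port
    json={
        "model": "internlm2-chat-7b",              # placeholder model name
        "messages": [{"role": "user", "content": "Count from 1 to 10."}],
        "stop": ["5", "\n\n"],                     # halt before emitting either string
    },
)
print(resp.json())
```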
[Bug] Many concurrent requests with `--enable-prefix-caching` AND `--quant-policy 8` crashes with: `CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/utils/allocator.h:231`
#1744 opened by josephrocca - 1
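The crash above pairs two engine options; a hedged sketch of the same configuration through the Python API, assuming `TurbomindEngineConfig` exposes fields matching the CLI flags (model path is a placeholder):

```python
# Sketch of the combination from issue #1744: prefix caching plus
# 8-bit KV cache quantization. Field names are assumed to mirror the
# CLI flags `--enable-prefix-caching` and `--quant-policy 8`.
from lmdeploy import TurbomindEngineConfig, pipeline

engine_cfg = TurbomindEngineConfig(
    enable_prefix_caching=True,  # reuse KV cache across shared prompt prefixes
    quant_policy=8,              # quantize the KV cache to 8 bits
)
pipe = pipeline("internlm/internlm2-chat-7b", backend_config=engine_cfg)
```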
[Bug] Problem with logits output
#1742 opened by GZL11 - 2
[Feature] Qwen 2 Support
#1746 opened by suptejas - 1
[Bug] xcomposer 4khd LoRA weight error in lmdeploy
#1747 opened by ztfmars - 0
[Bug] Space is incorrectly removed from start of generated text for `/v1/completion` endpoint
#1743 opened by josephrocca - 0
[Feature] `min_p` sampling parameter
#1745 opened by josephrocca - 2
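`min_p` sampling is commonly defined as discarding every token whose probability falls below `min_p` times the top token's probability. A minimal PyTorch sketch of that rule (a generic illustration, not lmdeploy's implementation):

```python
import torch

def min_p_filter(logits: torch.Tensor, min_p: float) -> torch.Tensor:
    # Mask out tokens with probability below min_p * (highest token probability).
    probs = torch.softmax(logits, dim=-1)
    threshold = min_p * probs.max(dim=-1, keepdim=True).values
    return logits.masked_fill(probs < threshold, float("-inf"))

logits = torch.randn(1, 32000)        # stand-in vocabulary logits
filtered = min_p_filter(logits, min_p=0.05)
next_token = torch.multinomial(torch.softmax(filtered, dim=-1), num_samples=1)
```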
[Bug] `detokenize_incrementally`: OverflowError: out of range integral type conversion attempted
#1739 opened by josephrocca - 20
[Bug] AttributeError: 'InternVLChatConfig' object has no attribute 'hidden_size'
#1725 opened by DefTruth - 1
[Docs] Guidance on setting `num_tokens_per_iter` and `max_prefill_iters` to optimal values
#1740 opened by josephrocca - 2
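For orientation, a hedged sketch of where those two knobs are set through the Python API; the values are illustrative stand-ins, not the optimal settings the issue asks to be documented:

```python
# Assumes TurbomindEngineConfig exposes the fields named in issue #1740.
from lmdeploy import TurbomindEngineConfig, pipeline

engine_cfg = TurbomindEngineConfig(
    num_tokens_per_iter=256,  # illustrative: tokens processed per engine iteration
    max_prefill_iters=4,      # illustrative: cap on iterations prefilling one request
)
pipe = pipeline("internlm/internlm2-chat-7b", backend_config=engine_cfg)
```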
[Feature] Speculative Decoding
#1738 opened by josephrocca - 3
[Docs] Where is prefix cache data stored?
#1737 opened by josephrocca - 4
[Bug] No output when quantizing the model
#1735 opened by NB-Group - 4
[Feature] InternVL-Chat-V1-5-AWQ merge LoRA adapter
#1691 opened by isongxw - 2
About InternVL-Chat-V1.5 8-bit quantization
#1727 opened by tairen99 - 7
[Bug] key_stats.pth not found when using 4-bit KV quantization with the internlm2-chat-1_8b model
#1720 opened by jxfruit - 4
lmdeploy 0.4.2: no response when running llama7-70b-instruct inference on 8 GPUs
#1712 opened by yak9meat - 3
[Bug] failed to set temperature 1.2
#1732 opened by zhyncs - 2
[Bug] CUDA OOM during calibration even with 5x 4090s? Falling back to `--device cpu` also fails (with a different error)
#1729 opened by josephrocca - 16
[Bug] When serving cogvlm2, concurrent requests interfere with each other: later requests use the image passed by earlier requests
#1730 opened by LRHstudy - 1
[Feature] Support for THUDM/glm-4v-9b
#1726 opened by Iven2132 - 1
High GPU memory usage when running InternVL-Chat-V1-5-AWQ
#1728 opened by tairen99 - 4
[Feature] Create Cuda 12 docker images
#1709 opened by nickmitchko - 7
[Bug] torch.cuda.OutOfMemoryError when loading the 4-bit InternVL-Chat-V1-5 vision model
#1704 opened by tairen99 - 5
[Feature] Any plan to support MiniCPM-V?
#1723 opened by HaoLiuHust - 0
How to trace multiple GPUs using Nsight Systems
#1722 opened by sleepwalker2017 - 2
[Feature] Make torchvision optional
#1717 opened by zhyncs - 1
batch inference
#1689 opened by dirtycomputer - 6
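For the question above: lmdeploy's `pipeline` accepts a list of prompts and batches them internally. A minimal sketch (model name is a placeholder):

```python
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")
responses = pipe([
    "What is 2 + 2?",
    "Name a prime number greater than 10.",
])
for r in responses:
    print(r.text)
```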
[Bug] qwen1.5 inference is not supported
#1697 opened by zzc0208 - 1
[Bug] ModuleNotFoundError: No module named '_turbomind' when loading llava Mistral 7B
#1699 opened by Alexis-IMBERT - 1
[Feature] Quantized inference on V100
#1711 opened by QwertyJack - 12
[Bug] RuntimeError: [TM][ERROR] Assertion fail: D:\a\lmdeploy\lmdeploy\src/turbomind/models/llama/Barrier.h:20
#1703 opened by NB-Group - 1
[Feature] Are there plans to support the GLM4V model?
#1713 opened by will-wiki - 2
AWQ optimization for small batches
#1707 opened by zhyncs - 1
[Bug]
#1695 opened by xiaoajie738 - 2
[Bug] InternVL-1.5 API server launched with LM cannot recognize images
#1701 opened by BigWhiteFox - 0
How is the difference in RoPE between `hf llama` and `meta llama` handled?
#1700 opened by sleepwalker2017 - 1
[Feature] Support for LLaVA-NeXT
#1685 opened by deece - 2
Encountered a core dump when quantizing the model
#1698 opened by zzc0208 - 3
[Bug] Output differs when temperature is set to zero
#1688 opened by zhyncs - 5
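Background for the report above: at temperature 0, softmax sampling should degenerate to argmax, making output deterministic. A generic sketch of the usual special case (not lmdeploy's code):

```python
import torch

def sample(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    if temperature == 0.0:
        return logits.argmax(dim=-1)  # greedy decoding: fully deterministic
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)
```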
[Docs] How are multiple images handled?
#1686 opened by pseudotensor - 3
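The question above concerns multi-image inputs; in the OpenAI-style message format, multiple images are sent as separate content parts. A sketch of such a payload (URLs are placeholders; how lmdeploy consumes the parts is what the issue asks to be documented):

```python
# One user message carrying text plus two images, OpenAI vision style.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Compare these two images."},
        {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
        {"type": "image_url", "image_url": {"url": "https://example.com/b.png"}},
    ],
}
```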
[Feature] support for MiniCPM-Llama3-V 2.5
#1693 opened by LRHstudy - 0
[Feature] The pinned peft<=0.9.0 requirement is too low and conflicts with many environments that require peft>0.10; can it be relaxed?
#1682 opened by OKC13 - 11
[Bug] Error when importing a local model while running internvl-v1.5 quantization from the downloaded code
#1681 opened by qingchunlizhi