Issues
- Any plans for a llama 3 version? (#42 opened by AWAS666, 0 comments)
- any performance testing about S-lora ? (#41 opened by x-transformers, 0 comments)
- Can I get a citation (#40 opened by sabetAI, 0 comments)
- Failed to run tp branch (#39 opened by sleepwalker2017, 0 comments)
- Any advice for debugging this project? (#38 opened by sleepwalker2017, 0 comments)
- Get stuck after running benchmark client (#37 opened by sleepwalker2017, 0 comments)
- Any performance of llama2-series model? (#35 opened by skykiseki, 0 comments)
- Workaround with GPT2 (#34 opened by jannikbuscha, 0 comments)
- Support qwen? (#33 opened by yinjiaoyuan, 1 comment)
- When will the Qianwen model be supported? (#26 opened by takemars, 0 comments)
- OpenAI API webserver compatible (#9 opened by giaosudau, 3 comments)
- Multi-GPU Support (#24 opened by luciferlinx101, 0 comments)
- Query multiple LoRA by weights (#31 opened by authurlord, 2 comments)
- Does it support V100 GPU? (#25 opened by Ted8000, 0 comments)
- not support baichuan (#29 opened by codernew007, 0 comments)
- Tensor parallelism with S-LoRA (#27 opened by debraj135, 2 comments)
- What's the difference from LoRAX (#22 opened by wDevil, 2 comments)
- Encountered some problems when adding the support for GPT-Q 4-bit quantized LLaMA-2 model. (#18 opened by suilin0432, 1 comment)
- ISSUE (#23 opened by sailakkshmiallada, 0 comments)
- Question about device bandwidth (#21 opened by qizzzh, 2 comments)
- This is huge! (#2 opened by yhyu13, 0 comments)
- Support chatglm3 ? (#20 opened by litetoooooom, 2 comments)
- Choosing adapters on inference (#12 opened by raihan0824, 1 comment)
- can it support rtx 4090 (24 gb) ? (#11 opened by jaiabhayk, 2 comments)
- Question about cuda kernel (#10 opened by harryhan618, 0 comments)
- Encoder-Decoder model support (#17 opened by aravindMahadevan, 0 comments)
- Support for GPT-NEOX models (#16 opened by bibekyess, 0 comments)
- torch 2.0.1 requires triton==2.0.0 (#8 opened by giaosudau, 0 comments)
- Quantisation (#6 opened by nivibilla, 1 comment)