BlackSamorez/tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
Python · MIT license
Pinned issues
Issues
Compatibility with `transformers > 4.36`: error: `AttributeError: 'tuple' object has no attribute 'to_legacy_cache'`
#137 opened by Dr-Left - 0
Customized generate func support?
#136 opened by MonolithFoundation - 1
RuntimeError: NCCL Error 3: internal error
#121 opened by smallmocha - 26
No output when using tensor_parallel
#128 opened by yyya9 - 0
How to use the model in a scenario where it is stored in the Safetensors format?
#127 opened by yxk9810 - 1
Out of GPU memory for two A10 GPUs
#126 opened by JunyiYe - 0
AttributeError: object has no attribute 'devices'
#125 opened by QiueY514 - 0
ValueError: Model parameters were moved to incorrect devices, did call on model.cuda() or model.to(device)? If so, please avoid doing that
#124 opened by Khyat - 2
Max Recursion Error when using with lora
#122 opened by Ar-Kareem - 1
Can I parallelize just one large layer?
#83 opened by chinmayjog13 - 0
Segmentation fault (core dumped)
#120 opened by jameswu2014 - 1
Support of 8-bit and 4-bit quantization
#119 opened by ludwigflo - 0
2x slowdown using TP
#117 opened by jph00 - 0
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
#116 opened by SparkJiao - 5
Why is a CUDA error raised?
#95 opened by YooSungHyun - 2
tensor_parallel method distributed=True
#114 opened by Johnno1011 - 3
model.generate() with inputs_embeds
#112 opened by ZhaoxuanWu - 0
Error loading LLAMA model config
#107 opened by tonywang16 - 6
Issues if GPU > 2
#98 opened by Tom-Ryder - 1
GPT-2 broken starting in v1.2.5
#99 opened by eric-mitchell - 1
Could tensor_parallel add multi-accelerator inference support with torch.distributed?
#97 opened by hijeffwu - 2
Possibility to run on different GPUs
#94 opened by Ch4mpa9ne - 6
Support for PEFT LoRA and 4-bit quantization
#80 opened by morecry - 1
Question on custom models
#88 opened by vince62s - 6
Does not work with 4-bit quant
#79 opened by laoda513 - 1
Does tensor_parallel support concurrent or multi-threaded model inference?
#86 opened by zoubaihan - 0
Does tensor_parallel support data parallel and tensor parallel hybrid training?
#85 opened by liguodongiot - 5
Torch version requirement
#76 opened by treya-lin - 0
Huggingface Accelerate
#74 opened by conceptofmind - 1
How to load lora weights?
#67 opened by Vincent131499 - 7