Issues
- ValueError: Must flatten tensors with uniform dtype but got torch.bfloat16 and torch.float32 (#61, opened by daje0601, 0 comments)
- OOM error in FSDP QLORA setup (#60, opened by ss8319, 0 comments)
- Quantization question (#56, opened by aptum11, 0 comments)
- Deprecation warnings (#52, opened by hohoCode, 0 comments)
- What's the use of "messages" in dpo step? (#48, opened by katopz, 0 comments)
- question about DeepSpeedPeftCallback (#47, opened by mickeysun0104, 1 comment)
- Re. fine-tune-llms-in-2024-with-trl.ipynb (#45, opened by andysingal, 6 comments)
- Instruction tuning of LLama2 is significantly slower compared to documented 3 hours fine-tuning time on A10G (#35, opened by mlscientist2, 2 comments)
- flash attention error on instruction tune llama-2 tutorial on Sagemaker notebook (#40, opened by matthewchung74, 10 comments)
- CUDA OOM error while saving the model (#16, opened by aasthavar, 4 comments)
- Precision Issue (#39, opened by zihaohe123, 11 comments)
- Does this work for Llama2 - Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA & Flash Attention? (#37, opened by ibicdev, 0 comments)
- Compute metrics while using SFT trainer (#34, opened by shubhamagarwal92, 1 comment)
- Cannot load tokenizer for llama2 (#33, opened by smreddy05, 6 comments)
- question about the llama instruction code (#28, opened by yeontaek, 0 comments)
- compute_metrics() function (#3, opened by ybagoury, 1 comment)
- gcc/cuda used for training (#24, opened by danyaljj, 3 comments)
- Colab notebook fails (#17, opened by TzurV, 6 comments)
- Error when training peft model example (#18, opened by Tachyon5, 2 comments)
- FLAN-T5 XXL using DeepSpeed fits well for training but gives OOM error for inference (#12, opened by irshadbhat, 5 comments)
- Inference on CNN validation set takes 2+ hours on p4dn.24xlarge machine with 8 A100s, 40GB each (#13, opened by sverneka, 4 comments)
- ValueError (#14, opened by Martok10, 7 comments)
- Error when finetuning Flan-T5-XXL on custom dataset (#10, opened by ngun7, 3 comments)
- OOM when finetuning FLANT5-xxl (#6, opened by AndrewZhe, 2 comments)
- Chat Inference Code (#4, opened by samarthsarin, 4 comments)