meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, plus a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
Language: Jupyter Notebook · License: NOASSERTION
Issues
Llama 3.1 Code Interpreter file reference
#610 opened by jonatananselmo - 9
DeepSpeed support for full fine-tuning - FSDP performance is not as good as DeepSpeed
#536 opened by waterluck - 6
FSDP finetuned model inference question
#634 opened by mathmax12 - 7
FLOPs counter doesn't seem to work
#642 opened by mathmax12 - 1
upgrade typing_extensions version
#645 opened by lxning - 5
Prompting for finetuning: should <|eot_id|> or <|end_of_text|> be the eos token? The model seems to forget when to stop after finetuning.
#640 opened by sammed-kamboj - 0
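A hedged sketch of the usual resolution for this question (not an official answer from the repo; the model name is illustrative): in Llama 3 chat-style fine-tuning, <|eot_id|> marks the end of each turn, while <|end_of_text|> ends a whole document, so stopping on <|eot_id|> is typically what you want.

```python
# Illustrative only: treat <|eot_id|> (end of turn) as the stop token for
# chat-style fine-tuning; <|end_of_text|> terminates an entire document.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
tokenizer.eos_token = "<|eot_id|>"  # assumption: training targets end-of-turn
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")

# At generation time, stop on end-of-turn as well:
# outputs = model.generate(**inputs, eos_token_id=eot_id)
```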
Evaluation details for MultiPL-E
#644 opened by devanshrj - 1
Can't reproduce Llama 3.1 evaluation results
#613 opened by Juhywcy - 3
Prompt template for agents with multiple tools and states + Positioning of info in prompts
#561 opened by DevMandal-Sarvam - 1
Warning: Asking to pad to max_length but no maximum length is provided and the model has no predefined maximum length.
#539 opened by artkpv - 2
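For context on the warning above, a minimal sketch (model name and length are assumptions): `padding="max_length"` needs an explicit `max_length` when the tokenizer has no predefined maximum, and Llama tokenizers also ship without a pad token.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers have no pad token

batch = tokenizer(
    ["a short example", "a slightly longer example"],
    padding="max_length",
    max_length=512,       # supplying this avoids the warning
    truncation=True,
    return_tensors="pt",
)
```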
no distributed view in tensorboard
#533 opened by ltm920716 - 3
Android example with MLC-LLM can't build with mlc-llm nightly package on MacOS x86-64
#534 opened by WuhanMonkey - 2
Mismatch in Llama Guard 2 prompt between llama.meta.com and code - `begin_text`
#617 opened by BrunoGomesCoelho - 3
Why are the label tokens set the same as the input tokens?
#637 opened by kaimoxuan123 - 5
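A short illustration of the standard answer to this question (an assumption about the repo's intent, but conventional for causal LM training): labels can equal input_ids because the model shifts them internally.

```python
import torch

input_ids = torch.tensor([[1, 2, 3, 4]])
labels = input_ids.clone()  # same tokens as the input
# Inside a Hugging Face causal LM, logits at position i are scored against
# labels at position i + 1, so the model still learns next-token prediction.
```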
General question about the difference between finetuning with Hugging Face's Trainer and using the llama-recipes finetune script
#517 opened by Tizzzzy - 1
Success with one command, failure with another
#518 opened by Tizzzzy - 1
Token indices sequence length is longer than the specified maximum sequence length for this model
#537 opened by biaoyanf - 2
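This warning fires when a tokenized sequence exceeds the tokenizer's model_max_length; a hedged sketch of the usual fix (model name and context length are assumptions):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
long_text = "word " * 10_000  # hypothetical over-long input

encoded = tokenizer(
    long_text,
    truncation=True,  # drop tokens past the limit, or chunk into windows instead
    max_length=4096,  # assumption: the model's context length
)
```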
Add support for iterable datasets when fine-tuning
#565 opened by BaiqingL - 2
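A minimal sketch of what this request describes (class and file names are hypothetical): a torch IterableDataset streams samples rather than indexing them, which suits corpora too large for memory; note that length-based batch sampling cannot apply here, since sample lengths aren't known up front.

```python
from torch.utils.data import IterableDataset

class StreamingTextDataset(IterableDataset):  # hypothetical helper
    """Stream one tokenized line at a time instead of loading the whole file."""

    def __init__(self, file_path, tokenizer, max_length=4096):
        self.file_path = file_path
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __iter__(self):
        with open(self.file_path) as f:
            for line in f:
                yield self.tokenizer(
                    line.strip(), truncation=True, max_length=self.max_length
                )
```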
Getting "Killed" when trying to finetune the model
#511 opened by Tizzzzy - 4
How to test my finetuned model
#535 opened by Tizzzzy - 1
Recommendations to save, store & re-use results?
#598 opened by smach - 1
AnT1nG-Meta-llama
#611 opened by ReyBan82 - 4
Herbal medicine
#624 opened by Sholman81 - 6
Is LengthBasedBatchSampler used to group similar-length sentences into one batch?
#620 opened by hjc3613 - 1
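A sketch of the idea behind such a sampler (not the repo's exact implementation, which may differ): sort indices by sample length so each batch holds similar-length sequences and padding waste shrinks. The same mechanism answers #574 below on why the dataset is sorted by length.

```python
import random
from torch.utils.data import Sampler

class LengthBasedBatchSamplerSketch(Sampler):  # illustrative name
    def __init__(self, lengths, batch_size, shuffle=True):
        self.lengths = lengths          # precomputed token length per sample
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __iter__(self):
        # Sort sample indices by length so each batch holds similar lengths.
        order = sorted(range(len(self.lengths)), key=lambda i: self.lengths[i])
        batches = [order[i:i + self.batch_size]
                   for i in range(0, len(order), self.batch_size)]
        if self.shuffle:
            random.shuffle(batches)     # randomize batch order, not contents
        yield from batches

    def __len__(self):
        return (len(self.lengths) + self.batch_size - 1) // self.batch_size
```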
Could not finetune Llama 3 on multiple GPUs
#556 opened by 1155157110 - 2
Some NCCL operations have failed or timed out.
#543 opened by lygjwy - 0
Tracking Issue for repo refactor
#579 opened by subramen - 3
What's the motivation for sorting the dataset by length?
#574 opened by Ber666 - 2
Found bugs in langgraph-rag-agent.ipynb
#571 opened by jarvisDang - 1
> I have a small amount of bitcoin and I can check Etherscan, but why are all my wallets still empty? What should I do?
#566 opened by Karliz24 - 1
Allow custom datasets to resize token embeddings
#564 opened by BaiqingL - 3
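A hedged sketch of the requested change (model name and token are illustrative): when a custom dataset adds new tokens, the model's embedding matrix must be resized to match the tokenizer.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

added = tokenizer.add_special_tokens({"additional_special_tokens": ["<custom>"]})
if added:
    model.resize_token_embeddings(len(tokenizer))  # grow embeddings to new vocab
```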
LlamaForCausalLM.from_pretrained: "Only Tensors of floating point and complex dtype can require gradients", on FSDP, Accelerate, quantization
#548 opened by artkpv - 1
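For the error above, a hedged note: quantized weights are integer-dtype and cannot require gradients, and peft's prepare_model_for_kbit_training is the usual way to ready such a model for training; the load call below is illustrative.

```python
from peft import prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True)   # integer-dtype base weights
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb,
)
# Freezes the quantized weights and casts the small floating-point pieces
# so that only trainable (float) parameters ever require gradients.
model = prepare_model_for_kbit_training(model)
```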
Model name llama2-70b-4096 defined in Getting_to_know_Llama.ipynb doesn't exist in Groq anymore
#541 opened by jarvisDang - 8
Error running finetuning.py: TypeError: Invalid function argument. Expected parameter `tensor` of type torch.Tensor, but got <class 'float'> instead.
#520 opened by winca - 2
Finetuning general LLM models from Hugging Face
#521 opened by bkhanal-11 - 0