Lightning-AI/litgpt
Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
Python · Apache-2.0
Issues
Finetuning with multiple GPUs extremely slow
#1472 opened by SergioG-M - 3
Command `litgpt download openlm-research/open_llama_13b` gives error: `Unrecognized arguments: openlm-research/open_llama_13b`
#1471 opened by VamsiYK - 3
Continue finetuning
#1464 opened by SergioG-M - 0
LR scheduler can result in a division by 0
#1393 opened by carmocca - 4
Finetune lora max_seq_length error
#1461 opened by SergioG-M - 1
Create new CI API key
#1433 opened by carmocca - 0
Using custom data for `Continue pretraining an LLM`
#1450 opened by SimiPixel - 2
Validation output during finetuning
#1443 opened by richardzhuang0412 - 2
Mixtral 8x22B support
#1448 opened by SergioG-M - 2
mistralai/Mistral-7B-v0.3 support
#1444 opened by karkeranikitha - 0
Specify cache for huggingface openwebtext download
#1446 opened by srivassid - 5
How to set max_iters
#1445 opened by srivassid - 0
Upgrade LitData
#1441 opened by rasbt - 2
Some confusion about weight conversion, as I need to use other frameworks to evaluate my LLM
#1436 opened by fireyanci - 3
Pretraining on a custom dataset: GPU memory OOM
#1432 opened by wen020 - 4
Resolve garbled output characters
#1422 opened by fireyanci - 1
Is there any best practice for using litdata to load custom data for pretraining?
#1428 opened by wen020 - 5
Continually pretrained Llama2-7B-hf model inference is not working on a 16GB GPU machine
#1423 opened by karkeranikitha - 4
How to pretrain Llama 2?
#1418 opened by wen020 - 4
prompt_style
#1416 opened by fireyanci - 1
How to pretrain Llama 2 on custom data?
#1427 opened by wen020 - 3
Stream option
#1420 opened by rasbt - 0
Python API
#1419 opened by rasbt - 4
LoRA recipes use lots of memory because parameters with gradients are not wrapped in a separate FSDP unit
#1417 opened by RuABraun - 3
Pretraining example from readme fails in Colab
#1402 opened by AndisDraguns - 0
Support for Qwen2 and Baichuan
#1411 opened by bestpredicts - 1
Redundancy?
#1408 opened by rasbt - 7
Streamline LitGPT API
#1403 opened by rasbt - 2
Remove old and unused LLMs
#1401 opened by rasbt - 0
LoRA matrices dropout
#1398 opened by belerico - 0
How to debug and resolve this error
#1394 opened by Learneducn - 3
Cannot copy out of meta tensor; no data!
#1378 opened by Gooooooogo - 1
Customizable loss function & inference step?
#1388 opened by Boltzmachine - 1
How to use a custom dataset for evaluation?
#1383 opened by Gooooooogo - 2
How to specify which GPU to use?
#1379 opened by Gooooooogo - 0
After some iterations when pretraining an LLM, an IndexError related to dataset chunking is raised
#1377 opened by MusulmonLolayev - 4
Why is FSDPStrategy so slow when I use multiple machines?
#1369 opened by Graduo