Lightning-AI/litgpt
Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
Python · Apache-2.0
Issues
Finetuning with multiple GPUs extremely slow
#1472 opened by SergioG-M - 3
Command `litgpt download openlm-research/open_llama_13b` gives error: `Unrecognized arguments: openlm-research/open_llama_13b`
#1471 opened by VamsiYK - 3
Continue finetuning
#1464 opened by SergioG-M - 0
LR scheduler can result in a division by 0
#1393 opened by carmocca - 4
Finetune lora max_seq_length error
#1461 opened by SergioG-M - 1
Create new CI API key
#1433 opened by carmocca - 0
Using custom data for `Continue pretraining an LLM`
#1450 opened by SimiPixel - 2
Validation output during finetuning
#1443 opened by richardzhuang0412 - 2
Mixtral 8x22B support
#1448 opened by SergioG-M - 2
mistralai/Mistral-7B-v0.3 support
#1444 opened by karkeranikitha - 0
Specify cache for huggingface openwebtext download
#1446 opened by srivassid - 5
How to set max_iters
#1445 opened by srivassid - 0
Upgrade LitData
#1441 opened by rasbt - 2
Some confusion about weight conversion, as I need to use other frameworks to evaluate my LLM
#1436 opened by fireyanci - 3
Pretraining on a custom dataset: GPU memory OOM
#1432 opened by wen020 - 4
Resolve garbled output characters
#1422 opened by fireyanci - 1
Is there any best practice for using litdata to load custom data for pretraining?
#1428 opened by wen020 - 5
Continually pretrained Llama2-7B-hf model inference is not working on a 16GB GPU machine
#1423 opened by karkeranikitha - 4
How to pretrain Llama 2?
#1418 opened by wen020 - 4
prompt_style
#1416 opened by fireyanci - 1
How to pretrain Llama 2 on custom data?
#1427 opened by wen020 - 3
Stream option
#1420 opened by rasbt - 0
Python API
#1419 opened by rasbt - 4
LoRA recipes use lots of memory because parameters with gradients are not wrapped in a separate FSDP unit
#1417 opened by RuABraun - 3
Pretraining example from readme fails in Colab
#1402 opened by AndisDraguns - 0
Support for Qwen2 and Baichuan
#1411 opened by bestpredicts - 1
Redundancy?
#1408 opened by rasbt - 7
Streamline LitGPT API
#1403 opened by rasbt - 2
Remove old and unused LLMs
#1401 opened by rasbt - 0
LoRA matrices dropout
#1398 opened by belerico - 0
How to debug and resolve this error
#1394 opened by Learneducn - 3
Cannot copy out of meta tensor; no data!
#1378 opened by Gooooooogo - 1
Customizable loss function & inference step?
#1388 opened by Boltzmachine - 1
How to use a custom dataset for evaluation?
#1383 opened by Gooooooogo - 2
How to specify which GPU to use?
#1379 opened by Gooooooogo - 0
After some iterations when pretraining an LLM, an IndexError related to dataset chunking is raised
#1377 opened by MusulmonLolayev - 4
Why is FSDPStrategy so slow when I use multiple machines?
#1369 opened by Graduo