Lightning-AI/litgpt
Pretrain, finetune, and deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: Flash Attention, FSDP, 4-bit quantization, LoRA, and more.
Python · Apache-2.0
Issues
- LR scheduler can result in a division by 0 (#1393, opened by carmocca, 1 comment)
- Create new CI API key (#1433, opened by carmocca, 3 comments)
- Using custom data for `Continue pretraining an LLM` (#1450, opened by SimiPixel, 2 comments)
- validation output during finetuning (#1443, opened by richardzhuang0412, 2 comments)
- Mixtral 8x22B support (#1448, opened by SergioG-M, 2 comments)
- mistralai/Mistral-7B-v0.3 support (#1444, opened by karkeranikitha, 0 comments)
- Specify cache for huggingface openwebtext download (#1446, opened by srivassid, 5 comments)
- How to set max_iters (#1445, opened by srivassid, 0 comments)
- Upgrade LitData (#1441, opened by rasbt, 2 comments)
- Some confusion about weight conversion, as I need to use other engineering to evaluate my LLM (#1436, opened by fireyanci, 3 comments)
- pretrain custom dataset gpu memory oom (#1432, opened by wen020, 4 comments)
- Resolve output characters garbled (#1422, opened by fireyanci, 1 comment)
- Is there any best practice for using litdata to load custom data for pretraining? (#1428, opened by wen020, 5 comments)
- Continually pretrained Llama2-7B-hf model inference is not working on 16GB GPU machine (#1423, opened by karkeranikitha, 4 comments)
- how to pretrain llama2? (#1418, opened by wen020, 4 comments)
- prompt_style (#1416, opened by fireyanci, 1 comment)
- how to pretrain llama2 in custom data? (#1427, opened by wen020, 3 comments)
- Stream option (#1420, opened by rasbt, 0 comments)
- Python API (#1419, opened by rasbt, 4 comments)
- Lora recipes use lots of memory because of not wrapping parameters with gradient in separate FSDP unit (#1417, opened by RuABraun, 3 comments)
- Pretraining example from readme fails in Colab (#1402, opened by AndisDraguns, 0 comments)
- support for qwen2 and baichuan (#1411, opened by bestpredicts, 1 comment)
- Redundancy? (#1408, opened by rasbt, 7 comments)
- Streamline LitGPT API (#1403, opened by rasbt, 2 comments)
- Remove old and unused LLMs (#1401, opened by rasbt, 0 comments)
- LoRA matrices dropout (#1398, opened by belerico, 0 comments)
- how to solve this debug (#1394, opened by Learneducn, 3 comments)
- Cannot copy out of meta tensor; no data! (#1378, opened by Gooooooogo, 1 comment)
- Customizable loss function & inference step? (#1388, opened by Boltzmachine, 1 comment)
- How to use custom dataset for evaluate? (#1383, opened by Gooooooogo, 7 comments)
- litgpt download doesn't work (#1363, opened by natanloterio, 2 comments)
- How to specify which GPU to use? (#1379, opened by Gooooooogo, 0 comments)
- After some iteration in pretraining a LLM, IndexError is raised related to dataset chunking (#1377, opened by MusulmonLolayev, 4 comments)
- Why FSDPStrategy is so slow-down when I use multi-machine (#1369, opened by Graduo, 5 comments)
- A potential bug for multi-GPU training (#1368, opened by zyushun, 4 comments)
- Failed to load the finetuned model with `AutoModelForCausalLM.from_pretrained(name, state_dict=state_dict)` (#1362, opened by zhaosheng-thu, 1 comment)
- Add support for memory-efficient and faster optimizers (#1364, opened by rasbt, 0 comments)
- combine FSDP with selective activation checkpointing (#1366, opened by nemoramo, 1 comment)
- Conversion to HF checkpoint should generate a checkpoint format that can be loaded directly (#1359, opened by awaelchli)
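Issue #1393 above reports that the LR scheduler can divide by zero. The sketch below is a hypothetical illustration of the common linear-warmup + cosine-decay schedule shape used in LLM pretraining, not litgpt's actual code: the decay branch divides by `max_steps - warmup_steps`, which is zero whenever warmup covers the entire run, so a guard is needed.

```python
import math

def cosine_with_warmup(step, max_lr, warmup_steps, max_steps, min_lr=0.0):
    """Schematic linear-warmup + cosine-decay LR schedule (illustrative only).

    Without a guard, the decay branch would divide by
    (max_steps - warmup_steps), which is zero when warmup_steps == max_steps —
    the kind of division-by-zero failure issue #1393 describes.
    """
    if step < warmup_steps:
        # Linear warmup from ~0 up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    # Guard: max(1, ...) keeps the denominator nonzero even when
    # warmup_steps == max_steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# With warmup_steps == max_steps, the unguarded formula would raise
# ZeroDivisionError; the guarded version returns max_lr instead.
print(cosine_with_warmup(step=100, max_lr=3e-4, warmup_steps=100, max_steps=100))
```

The guard is a minimal fix; an alternative is validating `warmup_steps < max_steps` up front and failing with a clear error message.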