Issues
Can adalomo be trained with the Trainer from transformers? Or might this be implemented in the future?
#73 opened by lyt719 - 1
A point of confusion about the LOMO paper
#79 opened by Pairshoe - 4
wandb permission
#23 opened by season1blue - 0
MoE and Custom Fine-tuning
#77 opened by DRXD1000 - 2
installable package
#74 opened by Borda - 17
Why hasn't LOMO taken off in popularity?
#47 opened by Flywolfs - 4
eval environment for opencompass
#65 opened by KaiLv69 - 37
How to load a 65B model with 24GB of GPU memory?
#71 opened by misonsky - 1
In my tests, LOMO + deepspeed zero2 + 7b llama qlora seems to use about twice the GPU memory of regular qlora + deepspeed zero2
#70 opened by zlh1992 - 0
Can you provide detailed dependency versions?
#69 opened by misonsky - 1
Evaluation of Fine-tuned model with Adalomo
#62 opened by sglucas - 2
adalomo throws an error with the chatglm2 model
#68 opened by JorunoJobana - 5
Reproduce the results for LOMO
#66 opened by shawnricecake - 15
adalomo optimizer error
#63 opened by shawnricecake - 3
Instructions for evaluation datasets
#64 opened by KaiLv69 - 18
model merge error
#59 opened by shawnricecake - 1
Mistral Support
#58 opened by freegheist - 4
ModuleNotFoundError: No module named 'rich' after 'python -m pip install rich'
#24 opened by SeekPoint - 2
cannot find adalomo class
#60 opened by Yeojoon - 10
a bug found in save_model of LOMOTrainer
#54 opened by DingQiang2018 - 0
Runtime error on 2nd epoch, trying lora only
#56 opened by wasifferoze - 1
The explanation of why the hook function needs one extra operation seems off to me, or perhaps I have misunderstood it
#57 opened by LaosGAmin - 4
Customized loss value
#52 opened by ZN1010 - 2
LLaMA-7B + LoRA OOMs on a 16GB V100
#53 opened by zhenqincn - 1
Can a custom Dataset only be a classification dataset?
#51 opened by Tonystark64 - 1
Does LOMO support training models in bfloat16?
#46 opened by Wangyupei - 2
The impact of gradient clipping and gradient overflow
#48 opened by tzjtatata - 5
Memory Usage continues to grow
#45 opened by Jetcodery - 16
Constant gradient overflow warnings after setting batch size to 2
#42 opened by 00drdelius - 1
Functions to measure the memory usage
#38 opened by JiaxiangRen - 1
Question about Equation 4
#37 opened by yaorong1996 - 1
Questions about understanding the code and GPU memory usage
#41 opened by anbyaa - 7
Both llama-33B and llama-65B report OOM; why won't they run on 8*V100?
#28 opened by alisyzhu - 2
Questions about fine-tuning llama-65b
#40 opened by Facico - 9
LoRA + LOMO distributed learning
#33 opened by JiaxiangRen - 7
Errors after simple modifications combining LOMO + QLoRA
#35 opened by 00drdelius - 1
KeyError: LOCAL_RANK
#31 opened by snykral - 3
about torch.stack(self.grad_norms)
#30 opened by jinzitian - 0
I ran ResNet50 with the LOMO optimizer on CPU; system memory usage shows no change compared to SGD. Is this expected?
#29 opened by yaocy - 3
Some confusion about the method of the paper
#27 opened by JorunoJobana