Issues
Error when converting HF checkpoints to Megatron format
#70 opened by Lilypad97 - 0
When training BERT, ERROR: "AttributeError: 'FullTokenizer' object has no attribute 'save_pretrained' "
#68 opened by yuzhiguo07 - 0
No update for a long time
#65 opened by dong-liuliu - 0
Has llama2 GQA been supported yet?
#64 opened by JiwenJ - 0
Is there a DingTalk or WeChat group for discussion?
#63 opened by felix0080 - 0
Llama 3 Support
#62 opened by john-theo - 1
Is training a small LLaMA model (e.g., 1B) from scratch supported?
#59 opened by liubo12 - 4
Question about the MLP inside the LLaMA decoder layer
#50 opened by yuanzhoulvpi2017 - 8
Unable to import Megatron
#51 opened by fyf2016 - 2
About batch_size
#61 opened by tszslovewanpu - 4
Converting Megatron-LM weights to HF format
#52 opened by Yang-QW - 0
sh LLaMA2_7B_standalone.sh
#60 opened by yangzhipeng1108 - 1
Does Megatron-LLaMA currently support training LLaMA2-70B?
#45 opened by 13416157913 - 0
Warning during weight conversion: "Zarr-based strategies will not be registered because of missing packages"
#57 opened by ZhangEnmao - 4
llama2-34b shape mismatch
#21 opened by cdj0311 - 1
Question about grad_norm accuracy when using the distributed optimizer
#56 opened by chivychao - 3
Could you share the configuration behind the 32-node throughput reported in the README? We cannot reproduce it
#49 opened by jianzi123 - 1
Request for a serving tutorial with example code
#48 opened by xealml - 0
Minor bug in the HF weight conversion code
#47 opened by yuanzhoulvpi2017 - 1
Question about how the GLOBAL_BATCH_SIZE value is calculated
#35 opened by 13416157913 - 1
Are INT4-quantized models supported by Megatron-LLaMA?
#46 opened by Jeff123z - 1
How to estimate GPU memory usage from the configuration parameters for models of different sizes?
#36 opened by 13416157913 - 5
Communication is required in every gradient-accumulation backward pass
#42 opened by jingjie01ai - 2
Is sequence parallelism compatible?
#44 opened by jingjie01ai - 0
Question about the CUDA_DEVICE_MAX_CONNECTIONS setting
#43 opened by Richie-yan - 5
Why does CUDA_DEVICE_MAX_CONNECTIONS no longer need to be 1 when using overlapped_distributed_optimizer?
#26 opened by yinzhijian - 1
Question about FP16 support
#41 opened by XUWeijiang - 4
TypeError: OverlappedDistributedOptimizer.gather_parameters() got an unexpected keyword argument 'skip_if_not_stepped'
#40 opened by Double-bear - 8
Does OverlappedDistributedOptimizer support pipeline parallelism > 1 and data parallelism > 1 at the same time?
#37 opened by Baibaifan - 4
Loss alignment
#31 opened by wuziyou199217 - 1
Error when saving the checkpoint after multi-node training with the NCCL backend
#32 opened by 13416157913 - 6
On an 8x A800 machine, enabling overlapped-distributed-optimizer is about 8% slower than use-distributed-optimizer
#22 opened by tingkuanpei - 0
Error when training llama-30b; is the llama-30b model unsupported?
#30 opened by 13416157913 - 5
Tested on 4 nodes with 8x A100 each: overlapped-distributed-optimizer is much slower than use-distributed-optimizer
#27 opened by silingtong123 - 10
NCCL communication boundary issue?
#17 opened by Baibaifan - 1
Does ParameterSchedule actually have any effect?
#25 opened by yinzhijian - 1
Error in the NGC 22.08 environment
#23 opened by EthanChen1234 - 7
Error when converting the saved Megatron-format checkpoint to HF format after training
#20 opened by 13416157913 - 1
deepspeed + megatron + llama: have the authors tried this combination?
#19 opened by Chandler-Bing