Issues
Fix `dev` branch's build after PTL upgrade
#418 opened by terrykong - 0
Will you support online DPO?
#414 opened by Shiguang-Guo - 3
serve_reward_model goes down
#351 opened by AtsunoriFujita - 3
SFT not working on nemo:24.05.01 container
#236 opened by vecorro - 4
How can I use nvidia/Llama-3.1-Nemotron-70B-Reward-HF directly for inference?
#360 opened by arunasank - 1
Unable to pip install nemo-aligner
#342 opened by SCccc21 - 0
[Question] Converting a Megatron-LM ckpt to nemo so we can use NeMo-Aligner for post-training
#340 opened by abgoswam - 6
add packed dataset
#181 opened by gshennvm - 0
[Question] TransformerEngine and Apex dependencies
#278 opened by peri044 - 0
make build_dataloader not take in cfg
#273 opened by gshennvm - 0
common class for aligner models
#272 opened by gshennvm - 0
GPTGenerateTRTLLM.trt_llm_exporter.refit failed due to empty weights in the refit engine during PPO actor training
#264 opened by renweizhukov - 0
Support converting HF reward models to .nemo
#115 opened by odelalleau - 1
reward-bench for Reward Model
#230 opened by lss11005 - 1
How to shuffle data before the start of each epoch?
#250 opened by Cppowboy - 1
Different performance from TRL DPO
#243 opened by Cppowboy - 2
Can you support KTO?
#143 opened by lifan-yuan - 0
better add_BOS and add_EOS support in reward models
#231 opened by gshennvm - 0
Policy Log Probs and Reference Log Probs differ at 1st iteration of DPO/RPO
#227 opened by shengyangs - 0
LoRA for Reward Model Training
#225 opened by bugsz - 5
Tutorial / Example for Single Node FP8 Inference?
#216 opened by noamgat - 1
Multiple training file support
#207 opened by seanliu96 - 4
How to fine-tune Qwen1.5 models based on NeMo
#175 opened by panjianfei - 2
How to fine-tune models with multiple nodes
#185 opened by panjianfei - 1
`cfg` in RLHFdataset doesn't have `length_params`
#173 opened by joonkeekim - 0
Docker build failing. Also, is there a .nemo reward model file available?
#167 opened by rundiffusion - 0
Can you please support context parallel?
#162 opened by DZ9 - 2
SFT does not work with `max_steps`
#159 opened by AtsunoriFujita - 0
Triton server cannot work with NeMo under Python 3.10, making it impossible to train PPO using a reward/critic server
#147 opened by DZ9 - 0
DPO crashes with `micro_batch_size > 1`
#153 opened by odelalleau - 1
SFT is broken with container 24.01.01
#131 opened by odelalleau - 0
`GPTSFTModel.generate()` crashes with PP>2
#145 opened by odelalleau - 1
Random samplers keep state
#107 opened by gshennvm - 2
Add support for `drop_last=False`
#96 opened by odelalleau - 0