Issues
Fix `dev` branch's build after PTL upgrade
#418 opened by terrykong - 0
Will you support online DPO?
#414 opened by Shiguang-Guo - 3
serve_reward_model goes down
#351 opened by AtsunoriFujita - 3
SFT not working on nemo:24.05.01 container
#236 opened by vecorro - 4
How can I use nvidia/Llama-3.1-Nemotron-70B-Reward-HF directly for inference?
#360 opened by arunasank - 1
Unable to pip install nemo-aligner
#342 opened by SCccc21 - 0
[Question] Converting a Megatron-LM ckpt to nemo so we can use NeMo-Aligner for post-training
#340 opened by abgoswam - 6
add packed dataset
#181 opened by gshennvm - 0
[Question] TransformerEngine and Apex dependencies
#278 opened by peri044 - 0
make build_dataloader not take in cfg
#273 opened by gshennvm - 0
common class for aligner models
#272 opened by gshennvm - 0
GPTGenerateTRTLLM.trt_llm_exporter.refit failed due to empty weights in the refit engine during PPO actor training
#264 opened by renweizhukov - 0
Support converting HF reward models to .nemo
#115 opened by odelalleau - 1
reward-bench for Reward Model
#230 opened by lss11005 - 1
How to shuffle data before the start of each epoch?
#250 opened by Cppowboy - 1
Different performance from TRL DPO
#243 opened by Cppowboy - 2
Can you support KTO?
#143 opened by lifan-yuan - 0
better add_BOS and add_EOS support in reward models
#231 opened by gshennvm - 0
Policy Log Probs and Reference Log Probs differ at 1st iteration of DPO/RPO
#227 opened by shengyangs - 0
LoRA for Reward Model Training
#225 opened by bugsz - 5
Tutorial / Example for Single Node FP8 Inference?
#216 opened by noamgat - 1
Multiple training file support
#207 opened by seanliu96 - 4
How to fine-tune Qwen1.5 models based on NeMo
#175 opened by panjianfei - 2
How to fine-tune models with multiple nodes
#185 opened by panjianfei - 1
`cfg` in RLHFdataset doesn't have `length_params`
#173 opened by joonkeekim - 0
Docker build failing. Also, is there a .nemo reward model file available?
#167 opened by rundiffusion - 0
Can you please support context parallel?
#162 opened by DZ9 - 2
SFT does not work with `max_steps`
#159 opened by AtsunoriFujita - 0
Triton server cannot work with NeMo under Python 3.10, making it impossible to train PPO using a reward/critic server
#147 opened by DZ9 - 0
DPO crashes with `micro_batch_size > 1`
#153 opened by odelalleau - 1
SFT is broken with container 24.01.01
#131 opened by odelalleau - 0
`GPTSFTModel.generate()` crashes with PP>2
#145 opened by odelalleau - 1
Random samplers keep state
#107 opened by gshennvm - 2
Add support for `drop_last=False`
#96 opened by odelalleau - 0