EleutherAI/gpt-neox

_forward_step_fn does not always return two values so eval.py breaks if is_pipe_parallel is false


This call to _forward_step_fn expects exactly two return values:

_, logits = self._forward_step_fn(model=self.model, data_iterator=inps)

However, forward_step can return three values:

return loss, outputs, metrics

I guess I am seeing this because I have is_pipe_parallel set to false, which is uncommon. Maybe there needs to be an option not to return metrics.
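One possible workaround on the caller side, sketched here rather than taken from the repository, is to read the logits by position so that both return shapes are accepted. In Python, unpacking a three-element tuple into two names raises a ValueError, which is presumably what breaks the call above. The names extract_logits and fake_forward_step are hypothetical and only stand in for the real functions; the sketch assumes logits are always the second element, as in both snippets quoted above.

```python
def extract_logits(step_output):
    """Return the second element (logits) of a forward-step result.

    Accepts either (loss, logits) or (loss, logits, metrics).
    Hypothetical helper, not part of gpt-neox.
    """
    if not isinstance(step_output, (tuple, list)) or len(step_output) < 2:
        raise ValueError("expected (loss, logits) or (loss, logits, metrics)")
    return step_output[1]


def fake_forward_step(return_metrics):
    """Stand-in for a forward step with the two return shapes from the issue."""
    loss, logits, metrics = 0.5, [0.1, 0.9], {"lm_loss": 0.5}
    return (loss, logits, metrics) if return_metrics else (loss, logits)


# Both shapes yield the same logits without an unpacking error.
logits_two = extract_logits(fake_forward_step(return_metrics=False))
logits_three = extract_logits(fake_forward_step(return_metrics=True))
```

At the call site this would look like logits = extract_logits(self._forward_step_fn(model=self.model, data_iterator=inps)). A starred unpack such as _, logits, *_ = ... would have the same effect; whether the better fix is here or an option in forward_step to skip metrics is a question for the maintainers.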

There are several "fixes" in https://github.com/markNZed/gpt-neox/tree/pipe_parallel_size_1 that might be related to this. I have not had time to prepare a PR, but I guess someone who knows the codebase could look at the changes there and quickly spot many easy-to-fix issues.

iPRET commented

Can confirm I've run into this issue multiple times as well, even with pipe parallel size >1.