turboderp/exui

Determinism

Closed this issue · 2 comments

It seems that exui sets the torch seed to the same value throughout a single chat, as the responses are deterministic.

In a chat, deleting the response and regenerating it always produces the same response, as does the "Regenerate" menu option under "More", which makes this button pretty pointless.

I would expect a fixed seed for the chat to be something one can set as an option.

Notably, the notepad does not behave like this; it is non-deterministic.

The chat mode doesn't use a fixed seed. Likely what you're seeing is just the more predictable nature of instruct-tuned models that tend to have strongly preferred responses sometimes. In the notepad mode you'd not be using the instruct template unless you're manually entering it, which tends to give you a broader distribution to sample from.

I have no problem getting different responses, e.g. with Gemma2-27B-it, temperature = 1.5, temperature last, top-P = 0.9 and the prompt "What's up with dogs?":

  • That's a great question! "What's up with dogs" is pretty broad, so to give you a good answer, I need you to be a little more specific. ...
  • Ah, dogs! What's not to love about them? They're pretty amazing creatures, aren't they? 🐶❤️ ...
  • That's a great question! "What's up with dogs?" could mean a lot of things! ...
  • That's a great question! "What's up with dogs?" invites so many answers! To give you a good one, I need you to be a little more specific. ...
  • etc.

Gemma2 really likes to compliment you on your great questions, because it thinks you're just awesome, but that's how alignment works. There is still a distribution to sample from, it's just biased towards certain responses. When it comes to refusals and such, that bias can be so strong as to seem deterministic (in an attempt to make the model "safe") but it's definitely not.
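To illustrate the point about a biased distribution looking deterministic, here is a minimal sketch in plain Python. It is not exui's or ExLlamaV2's sampler; the logit values and the helper names (`softmax`, `sample_counts`) are made up for illustration. It draws from an unseeded random generator, so nothing is fixed, yet a strongly peaked distribution still returns the same token almost every time.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_counts(logits, n=1000, temperature=1.0, rng=None):
    """Sample n token indices and count how often each one is picked."""
    rng = rng or random.Random()  # unseeded by default, i.e. no fixed seed
    probs = softmax(logits, temperature)
    counts = [0] * len(probs)
    for _ in range(n):
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        counts[idx] += 1
    return counts

# Hypothetical logits: one strongly preferred token vs. a nearly flat spread.
peaked_logits = [10.0, 2.0, 1.0, 0.5]
flat_logits = [1.2, 1.0, 0.9, 0.8]

if __name__ == "__main__":
    print("peaked:", sample_counts(peaked_logits))
    print("flat:  ", sample_counts(flat_logits))
```

With the peaked logits, the first token carries well over 99% of the probability mass, so repeated regeneration looks deterministic even though every draw is random. The flat logits produce visibly different picks from run to run.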

The default sampling settings are somewhat conservative so you could try messing with the cutoffs to get more varied responses.
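As a rough picture of what the cutoffs do, the sketch below implements a generic top-P (nucleus) filter in plain Python. It is an assumption-laden illustration, not exui's actual code or its default values: the logits are invented and `top_p_keep` is a hypothetical helper. A lower top-P keeps fewer candidate tokens, and a higher temperature flattens the distribution so the top token dominates less.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_keep(probs, top_p):
    """Return the indices of the smallest set of most likely tokens
    whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

# Hypothetical logits for five candidate tokens.
logits = [3.0, 2.0, 1.5, 1.0, 0.5]

if __name__ == "__main__":
    probs = softmax(logits)
    print("top-P 0.5 keeps:", top_p_keep(probs, 0.5))
    print("top-P 0.9 keeps:", top_p_keep(probs, 0.9))
    print("max prob at T=1.0:", max(softmax(logits, 1.0)))
    print("max prob at T=1.5:", max(softmax(logits, 1.5)))
```

Here the filter runs on the untempered probabilities, which loosely mirrors a "temperature last" ordering where temperature is only applied to the surviving candidates afterwards. Real samplers chain more stages than this.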

Indeed, false alarm it seems.