Exposing Token Log Probabilities / Conditional Probability
Describe the feature
I need to assess arbitrary input and score it based on how probable the model finds it in context (i.e., the normalized sum of each token's conditional log probability). I'm not sure how hard this would be, but it would be nice if I could get the raw token probabilities, because right now the only interfaces to the model are tokenization and completion. Thanks for your time and all your effort on this project!
I think this would be a measure similar to perplexity.
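To be concrete, the score I have in mind is the average per-token conditional log probability of the input given the context (perplexity would just be its exponentiated negation):

$$\mathrm{score}(w_{1:N} \mid c) = \frac{1}{N}\sum_{i=1}^{N} \log P(w_i \mid c, w_{1:i-1}), \qquad \mathrm{PPL} = \exp\bigl(-\mathrm{score}\bigr)$$

where $c$ is the context and $w_1, \dots, w_N$ are the tokens of the input being scored.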
Hi! This sounds too specific for the purpose of this project.
llama.cpp can return the following:
> `n_probs`: If greater than 0, the response also contains the probabilities of top N tokens for each generated token given the sampling settings. Note that for temperature < 0 the tokens are sampled greedily but token probabilities are still being calculated via a simple softmax of the logits without considering any other sampler settings. Default: 0
Is that what you'd like to have?
If yes, you can set the `nProbs` option of the LLMCharacter and then implement a function to get the predictions.
You can have a look at these functions and adapt them to get the returned probabilities (a rough sketch follows the list below):
- `Chat` (calls the completion and handles the result): https://github.com/undreamai/LLMUnity/blob/main/Runtime/LLMCharacter.cs#L491-L533
- `ChatContent` (result handling without streaming): https://github.com/undreamai/LLMUnity/blob/main/Runtime/LLMCharacter.cs#L419-L423
- `MultiChatContent` (result handling with streaming): https://github.com/undreamai/LLMUnity/blob/main/Runtime/LLMCharacter.cs#L425-L434
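Something along these lines could be a starting point. It's an untested sketch: the response classes below are my own, and the JSON field names (`completion_probabilities`, `tok_str`, `prob`) are assumptions based on one version of the llama.cpp server output with `n_probs` set, so check the raw result your setup returns before relying on them.

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;

// Hypothetical classes matching the (assumed) llama.cpp completion JSON when n_probs > 0.
// Verify the field names against the raw response of your llama.cpp version.
[Serializable]
public class TokenProb
{
    public string tok_str;  // candidate token text
    public float prob;      // its probability under the model
}

[Serializable]
public class CompletionProbability
{
    public string content;         // the token that was actually generated at this position
    public List<TokenProb> probs;  // top-N candidates for this position
}

[Serializable]
public class CompletionWithProbs
{
    public string content;
    public List<CompletionProbability> completion_probabilities;
}

public static class TokenScoring
{
    // Average log probability of the generated tokens (higher = more "expected").
    // exp(-result) would be the perplexity of the generated continuation.
    public static float AverageLogProb(string json)
    {
        CompletionWithProbs result = JsonUtility.FromJson<CompletionWithProbs>(json);
        float sum = 0f;
        int count = 0;
        foreach (CompletionProbability step in result.completion_probabilities)
        {
            // Find the probability assigned to the token that was actually generated.
            foreach (TokenProb candidate in step.probs)
            {
                if (candidate.tok_str == step.content)
                {
                    sum += Mathf.Log(Mathf.Max(candidate.prob, 1e-10f));
                    count++;
                    break;
                }
            }
        }
        return count > 0 ? sum / count : float.NegativeInfinity;
    }
}
```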
I totally understand if you don't think this fits the project, but that's not quite what I mean.
Basically, for the calculation of perplexity I need the probability of each token given all previous tokens, but the catch is that the token is supplied by the user rather than generated by the model.
So, something along the lines of: given the context "He went to the", the probability of "store" would be higher than the probability of "pumpkin".
As I said, I know this is a bit out of scope for this project. `n_probs` is close, but if the user's token isn't one of the top N tokens it will always return 0, and to get `n_probs` for a user-generated sentence I would need to generate one token at a time with a whole bunch of API calls (roughly the loop sketched below). I suppose I could try requesting something like the top 100 tokens, but then it would take forever. I guess this is just a hard problem. Thanks for any advice you can provide.
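For reference, the brute-force version I have in mind would look roughly like this. `NextTokenProbs` is a hypothetical callback that would wrap a single one-token completion call with `n_probs` set and return the top-N candidates for the next position; nothing here is an existing API of the plugin.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using UnityEngine;

public static class SentenceScorer
{
    // Hypothetical callback: run a one-token completion for `prefix` with n_probs set
    // and return the top-N candidates as (token text -> probability).
    // This is the part that costs one API call per token of the user's sentence.
    public delegate Task<Dictionary<string, float>> NextTokenProbs(string prefix);

    // Average log probability of a user-supplied continuation, scored token by token.
    // Tokens that don't appear in the top-N list fall back to floorProb,
    // which is exactly the limitation described above.
    public static async Task<float> AverageLogProb(
        string context, IList<string> userTokens, NextTokenProbs nextTokenProbs,
        float floorProb = 1e-6f)
    {
        string prefix = context;
        float sum = 0f;
        foreach (string token in userTokens)
        {
            Dictionary<string, float> topN = await nextTokenProbs(prefix);
            float prob = topN.TryGetValue(token, out float p) ? p : floorProb;
            sum += Mathf.Log(prob);
            prefix += token;  // extend the context with the user's token before the next call
        }
        return userTokens.Count > 0 ? sum / userTokens.Count : 0f;
    }
}
```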
Could you ask at the llama.cpp GitHub?
This isn't something llama.cpp exposes, to my understanding - that doesn't mean I'm correct 🙂.
Yeah definitely, sorry to bother you with this.
Also, upon further thought, `n_probs` might actually work for my use case, so thank you so much for guiding me down the right path.
Not at all, thanks for the request!
I just have limited capacity (PRs are more than welcome 🤗)
I'll close the issue but feel free to reopen it if you get stuck 🙂