Bug Report: The model frequently generates repetitive token sequences.

Question

Opened this issue 20 days ago · 4 comments

Razaghallu786 commented 20 days ago

Description of the bug:

No response

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

Answer 1 · 2024-12-18T18:21:02.000Z

Bug Report: Repetitive Token Generation in "gemini-1.5-flash" Model

Description of the Bug:
When generating long texts with the "gemini-1.5-flash" model, the output frequently falls into a loop of repeated token sequences that continues until the token limit is exhausted. The behavior is the same through both the Vertex and Gemini APIs.

Example:

"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be..."

Steps to Reproduce:

1. Use the "gemini-1.5-flash" model via the Vertex or Gemini API.
2. Generate a long text (e.g., a legal or technical document).
3. Check the generated output for repeated phrases or sentences.

Expected Behavior:
The model should produce coherent, non-repetitive text.

Actual Behavior:
The model enters a repetitive loop, generating the same token sequences over and over until the token limit is reached.

Impact:

Resource Waste: Tokens are wasted, increasing costs and exhausting API usage limits.

Output Quality: The generated text becomes unusable, requiring additional API requests.

Reproduction Rate:
Occurs frequently when generating long-form text.

Workaround:
There is currently no known workaround to prevent this issue.

Request for Resolution:

Investigate and resolve the cause of repetitive token generation.
Implement a mechanism to detect and avoid repetitive loops during generation.
Consider offering refunds or credits for tokens wasted due to this bug.
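
To illustrate the kind of loop detection requested above, here is a minimal client-side sketch. It is not how the API works internally and is not an official fix; the function name `find_repeated_tail` and the thresholds `min_len` and `min_repeats` are assumptions chosen for illustration. It checks whether the end of a generated text consists of the same chunk repeated back-to-back, which also catches outputs cut off mid-sentence like the example above.

```python
from typing import Optional


def find_repeated_tail(
    text: str, min_len: int = 20, min_repeats: int = 3
) -> Optional[str]:
    """Return the chunk that repeats back-to-back at the end of `text`.

    A chunk counts only if it is at least `min_len` characters long and
    occurs at least `min_repeats` times in a row at the very end of the
    text. Returns None when no such chunk is found.
    """
    n = len(text)
    # Try candidate chunk lengths from shortest to longest, so the
    # shortest qualifying repeating unit is the one reported.
    for unit_len in range(min_len, n // min_repeats + 1):
        unit = text[n - unit_len:]
        if text.endswith(unit * min_repeats):
            return unit
    return None
```

Run on the accumulated output (for example, after each streamed chunk), a non-None result could be used to stop reading the response early and discard it. The returned chunk may be a rotated version of the repeated sentence, since the check starts from the end of the text.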

Actual vs. Expected Behavior:
Actual Output:

"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed..."

Expected Output:
"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly."

Answer 2 · 2024-12-18T20:44:24.000Z

Vitalina12512 commented 20 days ago

Answer 3 · 2024-12-19T05:01:29.000Z

Hi @Razaghallu786,

Could you please clarify a bit more? Does this happen with features such as function calling or structured output, or simply when running the above prompt?

Answer 4 · 2024-12-21T22:13:20.000Z

What temperature are you using? If it is 0, can you try a higher value?
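
If a higher temperature does turn out to help, one way to try it only when needed is to retry a looped response at the next temperature up. The sketch below is a rough illustration under stated assumptions: `generate` is a hypothetical callable standing in for whatever SDK call you use (it takes a prompt and a temperature and returns text), and `looks_looped`, `generate_with_retry`, and the temperature values are made up for this example. Whether raising the temperature actually avoids the loop is untested here.

```python
from typing import Callable, Optional, Sequence


def looks_looped(text: str, min_len: int = 20, min_repeats: int = 3) -> bool:
    """True if the text ends with one chunk (>= min_len chars) repeated
    at least min_repeats times back-to-back."""
    n = len(text)
    return any(
        text.endswith(text[n - k:] * min_repeats)
        for k in range(min_len, n // min_repeats + 1)
    )


def generate_with_retry(
    generate: Callable[[str, float], str],  # hypothetical wrapper around the API call
    prompt: str,
    temperatures: Sequence[float] = (0.0, 0.7, 1.0),  # illustrative values
) -> Optional[str]:
    """Call `generate` at each temperature in order and return the first
    response that does not end in a repetition loop, or None if all do."""
    for temperature in temperatures:
        text = generate(prompt, temperature)
        if not looks_looped(text):
            return text
    return None
```

Each retry is a separate request and is billed as one, so this trades extra calls for a usable result rather than saving tokens.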