
Bug Report: The model frequently generates repetitive token sequences. #368

Open
Razaghallu786 opened this issue Dec 18, 2024 · 4 comments
Labels
component:examples Issues/PR referencing examples folder status:triaged Issue/PR triaged to the corresponding sub-team type:bug Something isn't working

Comments

@Razaghallu786

Description of the bug:

No response

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

@Razaghallu786
Author

Bug Report: Repetitive Token Generation in "gemini-1.5-flash" Model

Description of the Bug:
When generating long texts with the "gemini-1.5-flash" model, the output frequently degenerates into repetitive token sequences: the model loops on the same phrase until the output token limit is exhausted. The behavior is reproducible through both the Vertex AI and Gemini APIs.


Example:

"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be..."

Steps to Reproduce:

  1. Use the "gemini-1.5-flash" model via Vertex or Gemini API.

  2. Generate a long text (e.g., a legal or technical document).

  3. Observe the generated output for repeated phrases or sentences.


Expected Behavior:
The model should produce coherent, non-repetitive text.

Actual Behavior:
The model enters a repetitive loop, generating the same token sequences indefinitely until the token limit is reached.


Impact:

Resource Waste: Tokens are wasted, increasing costs and exhausting API usage limits.

Output Quality: The generated text becomes unusable, requiring additional API requests.


Reproduction Rate:
Occurs frequently when generating long-form text.


Workaround:
There is currently no known workaround to prevent this issue.
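That said, a purely client-side check can at least detect the loop after the fact and truncate the output before it is used downstream. The sketch below is illustrative only (`find_repetition` is a hypothetical helper, not part of any SDK), and it does not stop the tokens from being billed:

```python
def find_repetition(text, min_words=8, min_repeats=3):
    """Scan `text` for a phrase of at least `min_words` words that repeats
    `min_repeats` or more times back to back.

    Returns the word index where the looped phrase starts, or -1 if no
    such loop is found. Thresholds are assumptions; tune them per use case.
    """
    words = text.split()
    n = len(words)
    # Try every candidate phrase length, shortest first.
    for size in range(min_words, n // min_repeats + 1):
        for start in range(n - size * min_repeats + 1):
            chunk = words[start:start + size]
            # Check whether the next (min_repeats - 1) windows are identical.
            if all(words[start + k * size: start + (k + 1) * size] == chunk
                   for k in range(1, min_repeats)):
                return start
    return -1
```

For example, calling `find_repetition(response.text)` and getting a non-negative index would flag looped output like the sample above; the caller could then keep only the words up to that index plus one copy of the phrase.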


Request for Resolution:

  1. Investigate and resolve the cause of repetitive token generation.

  2. Implement a mechanism to detect and avoid repetitive loops during generation.

  3. Consider offering refunds or credits for tokens wasted due to this bug.


Actual vs. Expected Behavior:
Actual Output:

"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly. The judgment can be appealed..."

Expected Output:
"The judgment can be appealed in a motion for reconsideration, claiming that the judge did not consider the evidence properly."


@gmKeshari gmKeshari added type:bug Something isn't working status:triaged Issue/PR triaged to the corresponding sub-team component:examples Issues/PR referencing examples folder labels Dec 19, 2024
@gmKeshari

Hi @Razaghallu786,

Could you please provide a bit more clarification on this? Does this happen with features such as function calling or structured output, or also when simply running the prompt above?

@Giom-V
Collaborator

Giom-V commented Dec 21, 2024

Which temperature are you using? If you are using 0, can you try with a higher one?
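To make that suggestion concrete, here is a hedged sketch of a non-greedy sampling configuration. The dictionary keys match the `google-generativeai` `GenerationConfig` fields; the specific values are illustrative assumptions, not a confirmed fix for this bug:

```python
# Illustrative only: non-zero temperature plus nucleus sampling sometimes
# breaks deterministic repetition loops. Values here are assumptions.
generation_config = {
    "temperature": 0.7,          # avoid 0, which makes decoding greedy
    "top_p": 0.95,               # nucleus sampling adds diversity
    "max_output_tokens": 2048,   # caps wasted tokens if a loop still occurs
}

# Hypothetical usage with the google-generativeai SDK (requires an API key):
# import google.generativeai as genai
# model = genai.GenerativeModel("gemini-1.5-flash",
#                               generation_config=generation_config)
# response = model.generate_content(prompt)
```

Capping `max_output_tokens` does not prevent the loop, but it bounds the cost of any single runaway request while the root cause is investigated.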


4 participants