AssertionError: Sentences lengths should not exceed max_tokens=400000

Question

BirdiD opened this issue 3 months ago · 1 comment

❓ Questions and Help

Hello everyone,
I am trying to run an audio pretraining task with data2vec. I keep getting the following error: “AssertionError: Sentences lengths should not exceed max_tokens=400000”. I have tried modifying trainer.py as suggested in #4759, but this does not resolve the issue.

Has anyone already come across this error? Also, I don’t understand why there is an error about sentence length, since I am performing an audio pretraining task and not ASR.

Thanks in advance for your help!

Answer 1 · 2024-03-03T09:19:19.000Z

Just removing the max-tokens argument (the max_tokens setting named in the error) for audio pretraining solves the issue.
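As a rough illustration of why a "sentence length" check can fire on audio, here is a minimal sketch. It is not the library's actual code. It assumes, going by the error message, that the batching step compares the longest item in the dataset against max_tokens, and that for raw audio an item's length is its number of samples. The names check_lengths and too_long, the clip lengths, the 16 kHz sample rate, and the idea that a non-positive max_tokens disables the check are all hypothetical.

```python
import numpy as np


def check_lengths(num_tokens_vec, max_tokens):
    """Mimic the length check (assumption: max_tokens <= 0 disables it)."""
    lengths = np.asarray(num_tokens_vec)
    assert max_tokens <= 0 or lengths.max() <= max_tokens, (
        f"Sentences lengths should not exceed max_tokens={max_tokens}"
    )


def too_long(lengths, max_tokens):
    """Return the indices of items that are longer than max_tokens."""
    return [i for i, n in enumerate(lengths) if n > max_tokens]


# Hypothetical clip lengths in raw samples (16 kHz assumed): 10 s, 25 s, 30 s.
clip_lengths = [160_000, 400_000, 480_000]

# Only the 30 s clip is over the limit of 400000 samples.
print(too_long(clip_lengths, 400_000))
```

Under these assumptions, a single clip longer than 25 s at 16 kHz would be enough to trigger the assertion, which may be why dropping the limit makes the error go away. Listing the offending clips first shows whether cropping or filtering them is an option instead.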