n3d1117/chatgpt-telegram-bot

Handle responses longer than telegram message limit

AlexHTW opened this issue · 12 comments

Hey, I encountered the Telegram error Message_too_long when transcribing a 6-minute audio file.
Any chance of splitting responses longer than the limit (I think it's 4096 characters) into multiple messages?
This might also apply to ChatGPT responses. I haven't managed to get such a long response yet, but theoretically it should be possible, since 4096 tokens can amount to more than 4096 characters.
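The splitting described above can be sketched as a small helper. This is an illustrative implementation, not the bot's actual code: the 4096-character cap is Telegram's documented per-message limit, and breaking at the last newline inside the limit is an assumption to avoid cutting mid-line.

```python
TELEGRAM_MAX = 4096  # Telegram's per-message character limit


def split_into_chunks(text: str, limit: int = TELEGRAM_MAX) -> list[str]:
    """Split a long response into pieces that each fit in one Telegram message."""
    chunks = []
    while len(text) > limit:
        # Prefer to break at the last newline inside the limit, if any.
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be sent as its own message instead of one oversized call.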

Thank you and the other contributors for all your hard work!

Hi @AlexHTW, is this a Telegram or a ChatGPT limitation?

I don't have much time right now, but I've added it to the README in case someone wants to give it a try!

Hey, it's a Telegram limitation. We would just need to split the response into multiple messages whenever it exceeds the Telegram limit. Thanks for adding it, maybe I can figure it out and contribute.

@AlexHTW this should be fixed with #76. Can you confirm this?

Hey @n3d1117, thanks.
Yeah, I managed to get split messages for a 30-minute voice message and also by prompting for very long answers. So I would say everything works as intended.

I tested with 50-60 minute podcast mp3s and couldn't get a transcript, and unfortunately there was no log message either. Maybe it's an API limit or something about the files themselves. I don't think it has to do with this PR though, so I would say ship it 🚀

I noticed I don't receive a "new transcribe request received" info log when sending podcast mp3s to the bot.
On the main branch I do get the info log.
53min example file: https://content.blubrry.com/takeituneasy/lex_ai_noam_chomsky_2.mp3

@AlexHTW you're right. Bots can download files of up to 20MB in size (source), and the .mp3 you linked is 38.1MB.

There was actually an exception raised in the context.bot.get_file which wasn't handled. Should be improved now with f7bb416.
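In the spirit of that fix, the download can be wrapped so an oversized file yields a user-facing message instead of an unhandled exception. This is a minimal sketch, assuming python-telegram-bot's async conventions for `context` and `update`; the helper name and the reply wording are illustrative, and the real bot would catch telegram.error.TelegramError rather than a bare Exception.

```python
import logging


async def safe_get_file(context, update, file_id):
    """Download a Telegram file, reporting failures to the user instead of crashing."""
    try:
        return await context.bot.get_file(file_id)
    except Exception as exc:  # telegram.error.TelegramError in the real bot
        logging.exception(exc)
        await update.message.reply_text(
            "Failed to download the audio file. Note that Telegram bots "
            "can only download files up to 20MB in size."
        )
        return None
```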

BTW I tried compressing your mp3 using an online tool; the download then worked, but I got:

openai.error.APIError: Maximum content size limit (26214400) exceeded (50819087 bytes read) {
  "error": {
    "message": "Maximum content size limit (26214400) exceeded (50819087 bytes read)",
    "type": "server_error",
    "param": null,
    "code": null
  }
} 413

suggesting there's a limit (26214400 bytes, i.e. 25MB) for Whisper requests too. I think you're better off splitting your audio into multiple files.
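Splitting the audio as suggested can be done with only the stdlib for WAV files (mp3 would need an external tool such as ffmpeg). A sketch, where the 25MB budget covers the raw frame data and the function name is hypothetical:

```python
import wave

WHISPER_UPLOAD_LIMIT = 25 * 1024 * 1024  # Whisper's documented upload cap


def split_wav(path: str, max_bytes: int = WHISPER_UPLOAD_LIMIT) -> list[str]:
    """Split a WAV file into parts whose frame data each fit under max_bytes."""
    with wave.open(path, "rb") as src:
        params = src.getparams()
        bytes_per_frame = params.nchannels * params.sampwidth
        frames_per_chunk = max_bytes // bytes_per_frame
        parts = []
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{path}.part{index}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)  # wave fixes up nframes on close
                dst.writeframes(frames)
            parts.append(out_path)
            index += 1
    return parts
```

Each resulting part can then be transcribed independently and the transcripts concatenated.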

Thanks for the investigation! I was not aware of those limits.

I looked up the limitations in the Whisper docs, you are correct:

File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm

The user feedback for files that are too large is important, thanks. I would also suggest telling the user the actual 20MB limit when their file exceeds it.

Done! Merging now, thanks for testing

Hey, it looks like the streaming update broke the handling of long messages in chat responses.
After the Telegram message limit is reached, I get an additional message: "Failed to get response: Message_too_long".

chatgpt-telegram-bot_1  | Traceback (most recent call last):
chatgpt-telegram-bot_1  |   File "/app/bot/telegram_bot.py", line 370, in prompt
chatgpt-telegram-bot_1  |     await context.bot.edit_message_text(content, chat_id=sent_message.chat_id,
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/ext/_extbot.py", line 1453, in edit_message_text
chatgpt-telegram-bot_1  |     return await super().edit_message_text(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/_bot.py", line 331, in decorator
chatgpt-telegram-bot_1  |     result = await func(*args, **kwargs)  # skipcq: PYL-E1102
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/_bot.py", line 3230, in edit_message_text
chatgpt-telegram-bot_1  |     return await self._send_message(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/ext/_extbot.py", line 488, in _send_message
chatgpt-telegram-bot_1  |     result = await super()._send_message(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/_bot.py", line 512, in _send_message
chatgpt-telegram-bot_1  |     result = await self._post(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/_bot.py", line 419, in _post
chatgpt-telegram-bot_1  |     return await self._do_post(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/ext/_extbot.py", line 306, in _do_post
chatgpt-telegram-bot_1  |     return await super()._do_post(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/_bot.py", line 450, in _do_post
chatgpt-telegram-bot_1  |     return await request.post(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/request/_baserequest.py", line 165, in post
chatgpt-telegram-bot_1  |     result = await self._request_wrapper(
chatgpt-telegram-bot_1  |   File "/usr/local/lib/python3.9/site-packages/telegram/request/_baserequest.py", line 328, in _request_wrapper
chatgpt-telegram-bot_1  |     raise BadRequest(message)
chatgpt-telegram-bot_1  | telegram.error.BadRequest: Message_too_long

example prompt: list all presidents of the United States with a short biography and list of their major achievements.
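The failure mode above is that edit_message_text keeps growing one message past the cap. One way to fix it, sketched here as pure grouping logic (the function name is hypothetical and the real bot interleaves actual send/edit API calls): once the accumulated text would exceed the limit, start a fresh message instead of editing the current one.

```python
TELEGRAM_MAX = 4096  # Telegram's per-message character limit


def stream_chunks(tokens) -> list[str]:
    """Group a streamed token sequence into message-sized texts.

    Note: a single token longer than the cap would still overflow;
    real code would also hard-split such tokens.
    """
    messages = [""]
    for token in tokens:
        if len(messages[-1]) + len(token) > TELEGRAM_MAX:
            messages.append(token)    # overflow: begin a fresh message
        else:
            messages[-1] += token     # still fits: keep editing in place
    return messages
```

During streaming, the bot would edit the last message while it fits and send a new one on overflow, so no single edit ever exceeds the limit.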

Good catch @AlexHTW, I've got a fix and I'll push it tomorrow

Actually, I just pushed it now @AlexHTW, could you do a git pull and test it? bc6a4e4

Awesome. Tested it, works great :) Thank you