vysheng/tg

bug: unable to get history starting from 6000th message

tux-mind opened this issue · 8 comments

I'm writing a little script that saves all media exchanged with a peer.

But I cannot get past 6000 messages. Why?

This is the script output:

workstation-max tg >./bin/telegram-cli -Z backup.py
Telegram-cli version 1.3.3, Copyright (C) 2013-2015 Vitaly Valtman
Telegram-cli comes with ABSOLUTELY NO WARRANTY; for details type `show_license'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show_license' for details.
Telegram-cli uses libtgl version 2.0.3
Telegram-cli includes software developed by the OpenSSL Project
for use in the OpenSSL Toolkit. (http://www.openssl.org/)
Telegram-cli uses libpython version 3.4.1
I: config dir=[/home/max/.telegram-cli]
 *** Python Initialized
> parsed 100 messages
> parsed 200 messages
> parsed 300 messages
> parsed 400 messages
> parsed 500 messages
> parsed 600 messages
> parsed 700 messages
> parsed 800 messages
> parsed 900 messages
> parsed 1000 messages
> parsed 1100 messages
> parsed 1200 messages
> parsed 1300 messages
> parsed 1400 messages
> parsed 1500 messages
> parsed 1600 messages
> parsed 1700 messages
> parsed 1800 messages
> parsed 1900 messages
> parsed 2000 messages
> parsed 2100 messages
> parsed 2200 messages
> parsed 2300 messages
> parsed 2400 messages
> parsed 2500 messages
> parsed 2600 messages
> parsed 2700 messages
> parsed 2800 messages
> parsed 2900 messages
> parsed 3000 messages
> parsed 3100 messages
> parsed 3200 messages
> parsed 3300 messages
> parsed 3400 messages
> parsed 3500 messages
> parsed 3600 messages
> parsed 3700 messages
> parsed 3800 messages
> parsed 3900 messages
> parsed 4000 messages
> parsed 4100 messages
> parsed 4200 messages
> parsed 4300 messages
> parsed 4400 messages
> parsed 4500 messages
> parsed 4600 messages
> parsed 4700 messages
> parsed 4800 messages
> parsed 4900 messages
> parsed 5000 messages
> parsed 5100 messages
> parsed 5200 messages
> parsed 5300 messages
> parsed 5400 messages
> parsed 5500 messages
> parsed 5600 messages
> parsed 5700 messages
> parsed 5800 messages
> parsed 5900 messages
> parsed 6000 messages

Then the script hangs.

The value returned by peer.history is always true.

Thanks in advance for your help.

I can confirm this, my script doesn't even get further than 3000 for groups. The callback simply isn't called anymore beyond that point. This is also the case with the lua binding when you request 3000 messages or more from a group (this can be demonstrated with telegram-cli-backup).

I suspected it was a Telegram API limitation at first, but I tested with other clients (official Android and Telegram desktop) and they have no trouble replaying all the way back.

There is no trouble replaying all the way back in tg-cli either, so it can't be a tg-cli bug. I once retrieved a history of ~220,000 messages in one go with no problem; it just needed ~3 minutes to load. Might it be a memory limitation in your script? I don't know your code.

(Sent by email in reply to Tim van der Staaij's comment of 10 Aug 2015, 12:01 am, quoted above.)

My script just successfully pulled a ~70k group history after adding a time.sleep(1) between the 100-message history requests (and anything faster than 100 messages/second fails). So apparently it boils down to a throttling problem.
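A minimal sketch of the throttled paging described above, not the actual script. It is written synchronously for clarity, whereas the real bindings deliver pages through callbacks. The names `fetch_all` and `fetch_page` and the injectable `sleep` parameter are hypothetical; only the page size of 100 and the one-second pause come from this thread.

```python
import time

PAGE_SIZE = 100      # messages per request, as used in this thread
DELAY_SECONDS = 1.0  # pause between requests, as used in this thread


def fetch_all(fetch_page, page_size=PAGE_SIZE, delay=DELAY_SECONDS, sleep=time.sleep):
    """Collect a full history page by page, pausing between requests.

    fetch_page(offset, limit) is a hypothetical stand-in for whatever call
    retrieves one page of history; it must return a list of messages.
    """
    messages = []
    while True:
        page = fetch_page(len(messages), page_size)
        messages.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means the start of the history was reached.
            return messages
        # Throttle before asking for the next page.
        sleep(delay)
```

With these defaults the loop requests at most 100 messages per second, which matches the rate that worked here; whether a different rate is safe is not something this sketch can tell you.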

This issue is a bug nonetheless, since the callback is never delivered to the script. The callback should of course always be dispatched, even on failure; that's what the success parameter is for. I suspect this problem is rooted in tgl itself, because:

  • It applies to both the Python and Lua bindings.
  • Requesting a large history without an offset from the command line doesn't give a response either. @LukeLR can you still retrieve such a large history without using an offset? It might be the case that the Telegram API has a rate limitation now that it didn't have before and tgl chokes on it.
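To illustrate what the success parameter is for, here is a hedged sketch of a history callback that handles both outcomes. The trailing `(success, msgs)` arguments are an assumption about the callback shape, and `history_cb`, `collected` and `failures` are illustrative names. It only helps if the callback is actually dispatched, which is the bug reported here.

```python
from functools import partial


def history_cb(collected, failures, success, msgs):
    """Handle one history page; (success, msgs) is an assumed callback shape."""
    if not success:
        # Remember the offset at which the request failed so it can be retried.
        failures.append(len(collected))
        return
    collected.extend(msgs)


collected, failures = [], []
cb = partial(history_cb, collected, failures)  # bind state, leave (success, msgs) open

cb(True, ["a", "b"])  # simulated successful page
cb(False, None)       # simulated failed request
```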

Also, this history example in README-PY.md suggests that the history can be retrieved in one go without rate limiting. @Surye I suggest adding a note about this practical limitation.

@tvdstaaij and @LukeLR, can you please tell me more about which script you are using to get the chat history? I used telegram-cli-backup, but it has a limitation. Please share your script. Thank you in advance. Have a nice day!

@Hushnud My latest script is telegram-history-dump, which supports various output formats and is extensible with additional formats. I use it as part of my scripts to automatically generate group chat statistics every night. It can handle 100k+ messages and also media file downloads without a problem; it just takes a while because it inserts delays to prevent being cut off by Telegram servers.

Thank you, Tim @tvdstaaij, for your quick response. I appreciate it. I installed ruby 2 and json5.
Sorry for my misunderstanding, but do I need to run telegram-cli --json -P 9009 and after that telegram-history-dump.rb? I mean, in the same terminal window? I copied telegram-history-dump to my home folder. Thank you again for your help, Tim. Have a great day!

@Hushnud Telegram-cli has to be running while telegram-history-dump does its job. So you can start telegram-cli in one terminal window/tab and then start telegram-history-dump in another window/tab.

Alternatively you can do it in one window with this variation:

telegram-cli --json -P 9009 >/dev/null &
ruby telegram-history-dump.rb -k

This way telegram-cli will run in the background and telegram-history-dump will shut down telegram-cli when the backup is done.

Many, many thanks to you, Tim @tvdstaaij. I finally got the history. The main thing is that I understood how the code works. Thank you again, man. It's a pleasure to meet a person like you, who helps people. Good luck and God bless you!

With kind regards,
Hushnud