nitmir/btdht

Multiple get_peer requests causes 100% CPU usage

Opened this issue · 7 comments

Hi,

Im playing around with this library and have tried asking it for a small bunch of magnet hashes, in the start it goes without any issues at all. But after a random number of get_peer requests for some random hashes for popular torrents, the CPU usage spikes to 100% and stays there until i kill it.

I have tried to run only one request per 10 second, running it both in blocking and non-blocking mode, increasing the debug level, moving the query limit between 10 and 1000. No matter what changes i have done the issue appears.

I tried to reproduce the issue by making a script that simply polled the ubuntu example hash (and some more hashes based on that hash), but that did not seem to fail.

It appears that the last thing that happens before the lockup is that the library receives a couple of errors:

2984 nodes, 1607 goods, 59 bads | in: 51, out: 60 en 14s

ERROR:202:b'Server Error' pour (None, 1520126208.2239816, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:=\x08,\x902M1:y1:qe')
obj of type b'e'
8 nodes added to routing table
5 nodes added to routing table
ERROR:202:b'Server Error' pour (None, 1520126208.227913, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:\xd6\x92N\xd8 B1:y1:qe')
obj of type b'e'
8 nodes added to routing table
0 nodes added to routing table
7 nodes added to routing table
4 nodes added to routing table
ERROR:202:b'Server Error' pour (None, 1520126208.266789, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:d/ X\xaf\x911:y1:qe')
obj of type b'e'
8 nodes added to routing table
3 nodes added to routing table
8 nodes added to routing table
ERROR:202:b'Server Error' pour (None, 1520126208.3035867, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:1yV\x97\xc7F1:y1:qe')
obj of type b'e'
ERROR:202:b'Server Error' pour (None, 1520126208.2280884, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:\x8a,\n\x1f\xac\xa21:y1:qe')
obj of type b'e'
ERROR:202:b'Server Error' pour (None, 1520126208.2672093, b'd1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xeae1:q4:ping1:t6:x\xb7\xffe\x1c\xc81:y1:qe')
obj of type b'e'
ERROR:202:b'Server Error' pour (None, 1520126208.2678056, b"d1:ad2:id20:\x04Ogj^1+\xd3*Q\x0c\xa8\xdf\xcf\xe2\xea\xc2\x12?\xea6:target20:\x87\xd0\x0f\xdc\xdd'\x01,~(\xe0\x9f\xd4\x1c\\jz\xc6\x12be1:q9:find_node1:t6:\x10\xc5\x9e\xd7\x97'1:y1:qe")
obj of type b'e'
0 nodes added to routing table

I tried some profiling with Yappi, it says that these two calls were the most used, not sure if helpful but..

name ncall tsub ttot tavg
...5/queue.py:90 PollableQueue.empty 187.. 21.10275 42.51144 0.000023
..dist-packages/six.py:580 iteritems 187.. 23.93679 37.79221 0.000020

Any calls for get_peers afterwards will simply yield "no peers or nodes found"

I have not seen any special IO-activity while this happens,

This has been tested on the latest release

What could this be, and how can it be avoided?

After debugging on another machine i receive "Iterator get_peers_closest_loop stoped", but no more information than that. And there are no new incoming request after that.

Do you have some ideas to what the issue might be, @nitmir ?

Hi Ueland,

Sorry I do not have currently time to try to debug. the get_peers_closest_loop correspond to the method _get_peers_closest_loop in the btdht.dht.DHT_BASE class and it should not stop.

Something must make it somehow crash. The method _get_peers_closest_loop yield the next timestamp it want to be executed and the btdht.utils.Scheduler._schedule_loop do the execution by calling next() on the iterator return by the method _get_peers_closest_loop.

It is not surprising that the two most frequent calls are six.iteritems and PollableQueue.empty as this thow function are call on each loop of the scheduler. iteriemps in the _schedule_loop and the other in the _io_loop.

I hope you can find what cause the issue

i've worked around the issue like this ;;

dht = btdht.DHT()                                                                                                                     
dht.start()                                                                                                                 

time.sleep(15)                                                                 

for loop in items:

        tries += 1

        if tries is 10 :
                tries = 0
                print ("Restarting DHT")
                dht.stop()
                time.sleep(3)
                dht = btdht.DHT()
                dht.start()
                time.sleep(15)

Hi

Cloud you provide a minimal code triggering the bug ?
I have tried to reproduce it without success:
Here the code I ran to test get_peer (btdht, bs4 and requests installed in a python 3.9.2 venv with pip). If you use it, remenber to change the value of the myid variable.
It call get_peer every 10 secondes on all the hash listed on the ubuntu tracker https://torrent.ubuntu.com/tracker_index.

import binascii
import btdht
import time
import requests
from bs4 import BeautifulSoup


def parse_ubuntu_tracker(data):
    soup = BeautifulSoup(open('tracker_index').read(), 'html.parser')
    torrents = {}
    for code in soup.find_all('code'):
        tds = code.parent.parent.find_all('td')
        torrents[tds[1].text] = binascii.a2b_hex(tds[0].text)
    return torrents


def announce_peers(name):
    def wrap(peers):
        if peers:
            print(len(peers), 'peers for torrent', name)
        else:
            print(0, 'peers for torrent', name)
    return wrap


def main():
    r = requests.get("https://torrent.ubuntu.com/tracker_index")
    torrents = parse_ubuntu_tracker(r.text)
    myid = binascii.a2b_hex('137e34a1763b99dbfef34fd462305d705d7a4ebd')
    dht = btdht.DHT(id=myid, bind_port=6880, debuglvl=1)
    dht.load()
    dht.start()
    try:
        time.sleep(15)
        while True:
            # for h in hashs:
            for name, h in torrents.items():
                dht.get_peers(
                    h, block=False,
                    callback=announce_peers(name)
                )
            time.sleep(10)
            print("\n\n\n\n")
    except (Exception, KeyboardInterrupt):
        dht.save()
        dht.stop()
        raise


if __name__ == "__main__":
    main()

The output looks like:

50 peers for torrent kubuntu-16.04.6-desktop-amd64.iso
19 peers for torrent kubuntu-16.04.6-desktop-i386.iso
117 peers for torrent kubuntu-18.04.5-desktop-amd64.iso
95 peers for torrent kubuntu-18.04.5-desktop-i386.iso
101 peers for torrent kubuntu-20.04.2.0-desktop-amd64.iso
143 peers for torrent kubuntu-20.10-desktop-amd64.iso
201 peers for torrent kubuntu-21.04-desktop-amd64.iso
15 peers for torrent lubuntu-16.04-alternate-powerpc.iso
31 peers for torrent lubuntu-16.04.1-alternate-amd64.iso
31 peers for torrent lubuntu-16.04.1-alternate-i386.iso
50 peers for torrent lubuntu-16.04.6-desktop-i386.iso
42 peers for torrent lubuntu-18.04-alternate-amd64.iso
48 peers for torrent lubuntu-18.04-alternate-i386.iso
92 peers for torrent lubuntu-18.04.5-desktop-amd64.iso
105 peers for torrent lubuntu-20.04.2-desktop-amd64.iso
150 peers for torrent lubuntu-20.10-desktop-amd64.iso
18 peers for torrent mythbuntu-16.04.6-desktop-amd64.iso
14 peers for torrent mythbuntu-16.04.6-desktop-i386.iso
12 peers for torrent ubuntu-12.04.4-alternate-amd64+mac.iso
12 peers for torrent ubuntu-12.04.4-desktop-amd64+mac.iso
28 peers for torrent ubuntu-12.04.5-alternate-amd64.iso
23 peers for torrent ubuntu-12.04.5-alternate-i386.iso
27 peers for torrent ubuntu-12.04.5-desktop-amd64.iso
17 peers for torrent ubuntu-12.04.5-desktop-i386.iso
10 peers for torrent ubuntu-12.04.5-dvd-amd64.iso
16 peers for torrent ubuntu-12.04.5-dvd-i386.iso
48 peers for torrent ubuntu-12.04.5-server-amd64.iso
35 peers for torrent ubuntu-12.04.5-server-i386.iso
10 peers for torrent ubuntu-14.04.6-desktop-amd64+mac.iso
87 peers for torrent ubuntu-14.04.6-desktop-amd64.iso
42 peers for torrent ubuntu-14.04.6-desktop-i386.iso
70 peers for torrent ubuntu-14.04.6-server-amd64.iso
51 peers for torrent ubuntu-14.04.6-server-i386.iso
135 peers for torrent ubuntu-16.04.7-desktop-amd64.iso
116 peers for torrent ubuntu-16.04.7-server-amd64.iso
11 peers for torrent ubuntu-16.04.7-server-arm64.iso
298 peers for torrent ubuntu-18.04.5-desktop-amd64.iso
292 peers for torrent ubuntu-18.04.5-live-server-amd64.iso
34 peers for torrent ubuntu-18.04.5-server-amd64.iso
16 peers for torrent ubuntu-18.04.5-server-arm64.iso
14 peers for torrent ubuntu-18.04.5-server-ppc64el.iso
12 peers for torrent ubuntu-18.04.5-server-s390x.iso
17 peers for torrent ubuntu-20.04.1-legacy-server-arm64.iso
192 peers for torrent ubuntu-20.04.2-live-server-amd64.iso
39 peers for torrent ubuntu-20.04.2-live-server-arm64.iso
15 peers for torrent ubuntu-20.04.2-live-server-ppc64el.iso
21 peers for torrent ubuntu-20.04.2-live-server-s390x.iso
208 peers for torrent ubuntu-20.04.2.0-desktop-amd64.iso
186 peers for torrent ubuntu-20.10-desktop-amd64.iso
160 peers for torrent ubuntu-20.10-live-server-amd64.iso
23 peers for torrent ubuntu-20.10-live-server-arm64.iso
18 peers for torrent ubuntu-20.10-live-server-ppc64el.iso
13 peers for torrent ubuntu-20.10-live-server-s390x.iso
236 peers for torrent ubuntu-21.04-desktop-amd64.iso
280 peers for torrent ubuntu-21.04-live-server-amd64.iso
18 peers for torrent ubuntu-21.04-live-server-ppc64el.iso
45 peers for torrent ubuntu-budgie-18.04.5-desktop-amd64.iso
43 peers for torrent ubuntu-budgie-18.04.5-desktop-i386.iso
17 peers for torrent ubuntu-budgie-20.04.2.0-desktop-amd64.iso
99 peers for torrent ubuntu-budgie-20.10-desktop-amd64.iso
120 peers for torrent ubuntu-budgie-21.04-desktop-amd64.iso
36 peers for torrent ubuntu-gnome-16.04.5-desktop-amd64.iso
29 peers for torrent ubuntu-gnome-16.04.5-desktop-i386.iso
19 peers for torrent ubuntu-mate-16.04.6-desktop-i386.iso
86 peers for torrent ubuntu-mate-18.04.5-desktop-amd64.iso
57 peers for torrent ubuntu-mate-18.04.5-desktop-i386.iso
141 peers for torrent ubuntu-mate-20.04.2.0-desktop-amd64.iso
64 peers for torrent ubuntu-mate-20.10-desktop-amd64.iso
20 peers for torrent ubuntu-mate-21.04-desktop-amd64.iso
22 peers for torrent ubuntukylin-16.04.7-desktop-amd64.iso
30 peers for torrent ubuntukylin-18.04.5-desktop-amd64.iso
17 peers for torrent ubuntukylin-18.04.5-desktop-i386.iso
36 peers for torrent ubuntukylin-20.04.2.0-desktop-amd64.iso
32 peers for torrent ubuntukylin-20.10-desktop-amd64.iso
13 peers for torrent ubuntustudio-16.04.5-dvd-amd64.iso
14 peers for torrent ubuntustudio-16.04.5-dvd-i386.iso
35 peers for torrent ubuntustudio-18.04-dvd-amd64.iso
31 peers for torrent ubuntustudio-18.04-dvd-i386.iso
105 peers for torrent ubuntustudio-20.04.2.0-dvd-amd64.iso
87 peers for torrent ubuntustudio-20.10-dvd-amd64.iso
75 peers for torrent ubuntustudio-21.04-dvd-amd64.iso
28 peers for torrent xubuntu-16.04.6-desktop-amd64.iso
11 peers for torrent xubuntu-16.04.6-desktop-i386.iso
141 peers for torrent xubuntu-18.04.5-desktop-amd64.iso
202 peers for torrent xubuntu-18.04.5-desktop-i386.iso
194 peers for torrent xubuntu-20.04.2.0-desktop-amd64.iso
152 peers for torrent xubuntu-20.10-desktop-amd64.iso
166 peers for torrent xubuntu-21.04-desktop-amd64.iso
Fetching my own id
4780 nodes, 1529 goods, 23 bads | in: 538, out: 833 en 15s
4 peers for torrent ubuntu-20.04.1-legacy-server-amd64.iso
0 peers for torrent ubuntu-mate-16.04.6-desktop-amd64.iso
0 peers for torrent lubuntu-16.04.6-desktop-amd64.iso
0 peers for torrent lubuntu-18.04.5-desktop-i386.iso
0 peers for torrent lubuntu-21.04-desktop-amd64.iso
7 peers for torrent ubuntu-14.04.6-server-amd64+mac.iso
7 peers for torrent ubuntu-14.04.6-server-powerpc.iso
9 peers for torrent ubuntu-14.04.6-server-ppc64el.iso
0 peers for torrent ubuntukylin-21.04-desktop-amd64.iso

After running it for more than one hour, I did not experience any issues

Same after 30 hours

Hi there, unfortunately I no longer have access to the code and environment of which I triggered this bug back in 2018.

No problems,

I may have found a bug in how transactions id are handle. I think it may be related under high load.