shiyanhui/dht

Crawler mode speed

Zibri opened this issue · 13 comments

Zibri commented

Even after increasing the connection limits, I notice that crawler mode gets only 60 peers/minute.
Is there a setting to increase the speed?
With another crawler I have, I can get 100,000/hour!

Did you run it on a local network? This crawler can't currently run behind NAT.

Zibri commented

Is it the sample that you are running?

  • There are two kinds of peer messages in the DHT protocol: get_peers and announce_peer. get_peers messages are far more numerous than announce_peer messages, but only announce_peer is what we want. The example only prints BT seeds (torrents) that were fetched successfully. I don't know what p2pspider prints.
  • When we get an announce_peer message, we then fetch the BT seed. If the fetch fails, the ip:port is put on a blacklist, and the DHT crawler will not fetch from it again until some time in the future.

Zibri commented

OK, I'll figure it out.

ilcn commented

I have the same question as Zibri. Intrinsically, golang should be faster than nodejs, and I am listening for announce_peer right now. But please let us know what in the config we can tweak to make spider mode go faster.
Thanks

I recently rewrote p2pspider from node to golang. Same efficiency as before, but higher performance. @Zibri

Zibri commented

A golang version of simdht is here: godht

I don't think C is the way to go. You need to learn about the performance of the golang 1.9 runtime.
