schollz/croc

Sending a folder full of many small files takes a very long time

AbdoMahfoz opened this issue · 12 comments

When sending a folder, croc sends files one by one. This is fine as long as each file is large relative to the transfer speed, but when the folder is full of files that are each smaller than 100 KB, the transfer becomes very slow, even over a 30 Mbps connection.
I understand that using the zip option will fix this, but sometimes I need to sync a folder on my machine with a folder on my friend's machine, and he only has some of the files that I have in that folder. Zipping would make him download everything.
If croc takes a parameter that allows the user to specify how many files should be sent in parallel, this will fix this issue. If you think that this feature can be harmful in other scenarios, make that parameter default to 1.
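To make the request concrete, here is a minimal sketch of the idea in Python rather than croc's own Go code. The names `send_all`, `send_one`, and `parallel` are hypothetical and do not correspond to anything in croc; `send_one` stands in for whatever transfers a single file.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def send_all(paths: Iterable[str],
             send_one: Callable[[str], int],
             parallel: int = 1) -> List[int]:
    """Send every path with at most `parallel` transfers in flight.

    `send_one` is a hypothetical stand-in for the code that transfers a
    single file. With the default parallel=1, files go one by one.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # map() preserves input order, so results line up with paths.
        return list(pool.map(send_one, paths))
```

With a per-file cost dominated by round trips rather than bytes, raising `parallel` lets several small files overlap their waiting time, while the default of 1 keeps the one-by-one behaviour described above.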

What are the results of running traceroute 5.78.91.237?

@schollz I just ran into the same issue and searched to see whether someone had already reported it. I often run into this when, e.g., transferring git repos to a server via croc. This is the easiest way for you to reproduce it: take a large git repo (e.g. this one) and transfer it somewhere. You will see that it won't fully utilize your network.

Sorry for not replying earlier 😅

Here is my traceroute

traceroute to 5.78.91.237 (5.78.91.237), 30 hops max, 60 byte packets
 1  192.168.100.1 (192.168.100.1)  11.088 ms  10.890 ms  10.739 ms
 2  host-102.47.32.1.tedata.net (102.47.32.1)  14.478 ms  13.154 ms  14.192 ms
 3  10.29.112.93 (10.29.112.93)  15.296 ms 10.29.112.53 (10.29.112.53)  16.933 ms 10.29.112.81 (10.29.112.81)  16.691 ms
 4  10.38.83.122 (10.38.83.122)  14.724 ms 10.38.83.126 (10.38.83.126)  16.462 ms 10.38.83.122 (10.38.83.122)  16.244 ms
 5  10.39.14.13 (10.39.14.13)  16.108 ms 10.39.14.17 (10.39.14.17)  13.910 ms 10.39.14.13 (10.39.14.13)  15.759 ms
 6  * 10.39.15.161 (10.39.15.161)  20.412 ms  7.948 ms
 7  et-1-0-25.cr5-mil3.ip4.gtt.net (212.222.6.229)  58.386 ms  58.155 ms *
 8  ae5.cr5-mil2.ip4.gtt.net (89.149.138.186)  53.968 ms  53.505 ms et-1-0-25.cr5-mil3.ip4.gtt.net (212.222.6.229)  56.846 ms
 9  mno-b3-link.ip.twelve99.net (62.115.190.222)  56.481 ms  56.123 ms ae5.cr5-mil2.ip4.gtt.net (89.149.138.186)  55.763 ms
10  prs-bb1-link.ip.twelve99.net (62.115.135.224)  67.270 ms mno-b3-link.ip.twelve99.net (62.115.190.222)  55.097 ms prs-bb1-link.ip.twelve99.net (62.115.135.224)  66.645 ms
11  ldn-bb1-link.ip.twelve99.net (62.115.135.24)  71.473 ms prs-bb1-link.ip.twelve99.net (62.115.135.224)  65.982 ms *
12  ldn-bb1-link.ip.twelve99.net (62.115.135.24)  76.514 ms *  72.981 ms
13  * * *
14  * ash-b2-link.ip.twelve99.net (62.115.123.125)  227.087 ms *
15  chi-b23-link.ip.twelve99.net (62.115.126.45)  212.353 ms * *
16  * * *
17  den-bb2-link.ip.twelve99.net (62.115.137.114)  179.269 ms * *
18  * den-bb2-link.ip.twelve99.net (62.115.137.114)  176.843 ms *
19  den-bb1-link.ip.twelve99.net (62.115.127.66)  176.605 ms *  178.544 ms
20  dls-b23-link.ip.twelve99.net (62.115.136.119)  240.094 ms * *
21  * den-b3-link.ip.twelve99.net (62.115.137.153)  225.316 ms palo-b24-link.ip.twelve99.net (62.115.139.107)  212.393 ms
22  port-b3-link.ip.twelve99.net (62.115.115.25)  215.635 ms den-bb1-link.ip.twelve99.net (62.115.138.68)  191.057 ms *
23  port-b3-link.ip.twelve99.net (62.115.115.25)  214.059 ms * palo-b24-link.ip.twelve99.net (62.115.139.107)  221.332 ms
24  palo-b24-link.ip.twelve99.net (62.115.139.107)  222.920 ms port-b5-link.ip.twelve99.net (62.115.112.45)  213.433 ms hetzner-ic-375332.ip.twelve99-cust.net (80.239.132.89)  218.905 ms
25  spine2.cloud1.hil.hetzner.com (5.78.0.86)  211.929 ms port-b5-link.ip.twelve99.net (62.115.112.45)  221.450 ms spine1.cloud1.hil.hetzner.com (5.78.0.82)  216.875 ms
26  hetzner-ic-375331.ip.twelve99-cust.net (80.239.132.87)  226.268 ms port-b5-link.ip.twelve99.net (62.115.112.45)  224.315 ms spine1.cloud1.hil.hetzner.com (5.78.0.82)  222.422 ms
27  21928.your-cloud.host (5.78.10.35)  212.407 ms spine1.cloud1.hil.hetzner.com (5.78.0.74)  240.575 ms *
28  * port-b3-link.ip.twelve99.net (62.115.115.25)  240.993 ms spine2.cloud1.hil.hetzner.com (5.78.0.86)  224.236 ms
29  * * *
30  * * 21928.your-cloud.host (5.78.10.35)  227.699 ms

@lukas-mertens maybe the --zip option?

@schollz Yes, that should solve it. I don't know if this is out of scope, but a nice way to improve the UX would be to detect when there are many small files to transfer and then interactively ask the user: "Detected many small files. Do you want to enable '--zip true' to speed up the transfer (Y/n)?"
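The detection step could be sketched roughly as follows (in Python for illustration, not croc's Go code). The function name and all three thresholds are assumptions: the 100 KB cutoff is borrowed from the original report, while the minimum file count and the fraction are arbitrary placeholders. The interactive prompt itself is left out.

```python
import os


def mostly_small_files(root: str,
                       small_bytes: int = 100 * 1024,
                       min_files: int = 100,
                       small_fraction: float = 0.8) -> bool:
    """Return True if `root` holds many files and most of them are small.

    All thresholds are illustrative assumptions, not values used by croc.
    """
    total = 0
    small = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            if os.path.getsize(os.path.join(dirpath, name)) < small_bytes:
                small += 1
    if total == 0:
        return False  # empty folder: nothing to warn about
    return total >= min_files and small / total >= small_fraction
```

A sender could call this once before the transfer starts and only show the question when it returns True.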

I didn't know this option exists

The problems with the zip option are:

  1. It requires double the space, which may not be viable when copying large games with lots of small files.
  2. You lose the hash functionality that allows you to only transfer files that are different.
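To illustrate the second point: skipping unchanged files amounts to comparing per-file digests from both sides, which a single archive hides. Below is a minimal sketch in Python, assuming SHA-256 as a stand-in hash; croc's actual hash algorithm and protocol are not shown here, and `hash_tree` and `files_to_send` are hypothetical names.

```python
import hashlib
import os
from typing import Dict, List


def hash_tree(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to a digest of its bytes."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()  # stand-in hash, chosen for illustration
            with open(full, "rb") as f:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digests[rel] = h.hexdigest()
    return digests


def files_to_send(sender: Dict[str, str],
                  receiver: Dict[str, str]) -> List[str]:
    """Return paths the receiver lacks or holds with different content."""
    return sorted(p for p, d in sender.items() if receiver.get(p) != d)
```

Once the folder is packed into one zip, there is only one digest to compare, so any difference forces the whole archive to be transferred.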

Please consider sending multiple files at the same time, controlled by an optional parameter that defaults to 1.

@AbdoMahfoz would welcome pr, thanks

On it 🫡

Now that this is a thing, should we reopen this issue?

@AbdoMahfoz sounds good, looking forward to the pr