scdl-org/scdl

add multithreading

Opened this issue · 2 comments

When downloading all likes or a playlist, I think the program would benefit from a multithreaded approach. Something like adding an option --num-threads n that would partition the track list into n interleaved slices, one per thread:

main thread is 0

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 ^ 1 ^ 1 ^ 1 ^ 1 ^ 1  ^  1  ^  1  ^  <-- 2 threads
  2   2   2   2   2    2     2     2

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 ^ ^ ^ 1 ^ ^ ^ 1 ^ ^  ^  1  ^  ^  ^  <-- 4 threads
  2 | |   2 | |   2 |  |     2  |  |
    3 |     3 |     3  |        3  |
      4       4        4           4

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1 ^ ^ ^ ^ ^ ^ ^ 1 ^ ^  ^  ^  ^  ^  ^  <-- 8 threads
  2 | | | | | |   2 |  |  |  |  |  |
    3 | | | | |     3  |  |  |  |  |
      4 | | | |        4  |  |  |  |
        5 | | |           5  |  |  |
          6 | |              6  |  |
            7 |                 7  |
              8                    8
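
To make the diagrams above concrete, here is a rough Python sketch of the interleaved split. The names partition_round_robin, download_all and download_one are made up for illustration and are not part of scdl; download_one stands in for whatever downloads a single track.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(items, num_threads):
    """Split items into num_threads interleaved slices.

    Worker k receives items k, k + n, k + 2n, ... as in the diagrams.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return [items[k::num_threads] for k in range(num_threads)]


def download_all(tracks, num_threads, download_one):
    """Run download_one over each slice in its own worker thread.

    download_one is a hypothetical callable (an assumption of this
    sketch), not an existing scdl function.
    """
    slices = partition_round_robin(tracks, num_threads)

    def run_slice(chunk):
        return [download_one(track) for track in chunk]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map keeps slice order, so results[k] belongs to worker k
        return list(pool.map(run_slice, slices))


if __name__ == "__main__":
    print(partition_round_robin(list(range(16)), 4))
```

A fixed split like this can leave some threads idle if their tracks happen to be short; handing tracks to the pool one at a time would likely balance better, but the sketch follows the partitioning described here.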

Of course, this would create a small data race on the already-downloaded file, but that could be solved by creating n temp files that are later appended to the already-downloaded file (duplicates in playlists are not allowed on SoundCloud; image proof attached).
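
A minimal sketch of the temp-file idea, assuming the already-downloaded file is a plain text file with one track ID per line. The function names and download_one are hypothetical, and error handling (for example cleaning up temp files after a failed download) is left out.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def download_slice(chunk, download_one):
    """Download one slice and log finished track IDs to a private temp file."""
    fd, temp_path = tempfile.mkstemp(suffix=".ids")
    with os.fdopen(fd, "w", encoding="utf-8") as temp_file:
        for track_id in chunk:
            download_one(track_id)  # hypothetical single-track download
            temp_file.write(f"{track_id}\n")
    return temp_path


def download_with_archive(track_ids, num_threads, download_one, archive_path):
    """Download in parallel, then append each temp file to the shared file."""
    slices = [track_ids[k::num_threads] for k in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        temp_paths = list(
            pool.map(lambda chunk: download_slice(chunk, download_one), slices)
        )
    # Only the main thread writes to the shared file, after all workers
    # have finished, so no two threads ever write to it at once.
    with open(archive_path, "a", encoding="utf-8") as archive:
        for temp_path in temp_paths:
            with open(temp_path, encoding="utf-8") as temp_file:
                archive.write(temp_file.read())
            os.remove(temp_path)
```

Since each worker writes only to its own temp file, no locking is needed; the trade-off is that the shared file is only updated once the whole run completes.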

I'm not sure if the files are large enough to really make a difference, but another idea along those same lines is HTTP range requests: if the server supports RFC 7233 (partial content), split the file into chunks, then download the chunks concurrently and merge them into the destination file.

An example of this is here: https://github.com/melbahja/got
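
For reference, a rough Python sketch of the range-request approach using only the standard library. It assumes the total size is already known (for instance from a Content-Length header) and that the server answers Range requests with 206 Partial Content; the function names are made up for illustration and have nothing to do with how got or scdl work internally.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, num_chunks):
    """Return inclusive (start, end) byte ranges covering total_size bytes."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    if total_size <= 0:
        return []
    chunk_size = -(-total_size // num_chunks)  # ceiling division
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def fetch_range(url, start, end):
    """Fetch one byte range; fail if the server ignores the Range header."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{end}"}
    )
    with urllib.request.urlopen(request) as response:
        if response.status != 206:
            raise RuntimeError("server did not return partial content")
        return start, response.read()


def download_in_chunks(url, total_size, dest_path, num_chunks=4):
    """Fetch ranges concurrently and write each at its offset in dest_path."""
    ranges = split_ranges(total_size, num_chunks)
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        parts = pool.map(lambda r: fetch_range(url, *r), ranges)
        with open(dest_path, "wb") as dest:
            for start, data in parts:
                dest.seek(start)
                dest.write(data)
```

This version holds each chunk in memory before writing it, which is probably fine for audio tracks but would need streaming for very large files.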

This issue is about downloading multiple files from a playlist concurrently, not about splitting individual files into chunks.