chapmanjacobd/library

lb dl downloads more than it should

chapmanjacobd opened this issue · 2 comments

Suppose I add the following playlist https://www.youtube.com/playlist?list=PLqxP5EuGxPnfmg5P0_96bz9E__-XrLuFc by running:

lb tubeadd test.db https://www.youtube.com/playlist?list=PLqxP5EuGxPnfmg5P0_96bz9E__-XrLuFc -vv

How can I download a single video, by its URL, out of all those added to the database by lb tubeadd?

Normally, I'd expect the following command to do the trick:

lb dl test.db --video https://www.youtube.com/watch?v=MhlFR2wWqHA -vv

Instead of downloading only this video, lb dl downloads all of them, which makes progress polling problematic and leads to a race condition.

Here is how I debugged this:

lb dl test.db --video https://www.youtube.com/watch?v=MhlFR2wWqHA -vvvv -pa

...

{'profile': 'video', 'extractor_config': {}, 'prefix': '/home/xk/github/xk/lb', 
 'print': 'a', 'sort': 'm.path like "http%"\n        , m.title IS NOT NULL desc\n        , m.path\n        , random()', 
'where': [], 'include': [], 'exclude': [], 'retry_delay': '14 days', 
'db': <Database <sqlite3.Connection object at 0x7fbacce01e40>>, 'verbose': 4, 'database': '/tmp/tmp.ehhyY7Lx7h/test.db', 
'playlists': ['https://www.youtube.com/watch?v=MhlFR2wWqHA'], 'defaults': ['sort', 'limit'], 'unk': [], 'action': 'download'}

...

path         count
---------  -------
Aggregate       15

We can see that the URL is parsed into args.playlists, and the total number of queued downloads is 15.
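A hypothetical argparse sketch (not the library's actual parser; the argument names are assumptions based on the debug dump above) of how a bare video URL can end up in args.playlists while args.include stays empty:

```python
import argparse

# Hypothetical sketch of the parsing behavior shown in the debug output:
# a bare URL lands in the positional `playlists` list whether it points
# at a playlist or a single video, so no per-video filtering happens.
parser = argparse.ArgumentParser(prog="lb dl")
parser.add_argument("database")
parser.add_argument("playlists", nargs="*")
parser.add_argument("-s", "--search", dest="include", action="append", default=[])

args = parser.parse_args(["test.db", "https://www.youtube.com/watch?v=MhlFR2wWqHA"])
print(args.playlists)  # the video URL ends up in args.playlists
print(args.include)    # empty -- nothing narrows the query
```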

lb dl test.db --video -s https://www.youtube.com/watch?v=MhlFR2wWqHA -vvvv -pa

...

{'profile': 'video', 'extractor_config': {}, 'prefix': '/home/xk/github/xk/lb', 
'print': 'a', 'sort': 'm.path like "http%"\n        , m.title IS NOT NULL desc\n        , m.path\n        , random()', 
'where': [], 'include': ['https://www.youtube.com/watch?v=MhlFR2wWqHA'], 'exclude': [], 'retry_delay': '14 days', 
'db': <Database <sqlite3.Connection object at 0x7f2a8246de40>>, 'verbose': 4, 'database': '/tmp/tmp.ehhyY7Lx7h/test.db', 'playlists': [], 'defaults': ['sort', 'limit'], 'unk': [], 'action': 'download'}

...

path         count
---------  -------
Aggregate        1

Now it is using args.include and queuing only a single video, which is what was desired.

According to the documentation (lb dl -h), args.playlists should limit lb dl to specific playlists, but it looks like that functionality was never implemented, so the argument currently does nothing. To filter by video URL, --include or --search (-s) is the correct way and will still be required.
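A minimal sketch of the kind of filtering --include implies, assuming (this is not the library's actual schema) a media table with a path column narrowed by a SQL LIKE clause:

```python
import sqlite3

# Sketch: an unfiltered query matches every row, while an --include-style
# LIKE filter narrows the result to the one matching URL.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE media (path TEXT)")
con.executemany(
    "INSERT INTO media VALUES (?)",
    [("https://www.youtube.com/watch?v=MhlFR2wWqHA",),
     ("https://www.youtube.com/watch?v=other1",),
     ("https://www.youtube.com/watch?v=other2",)],
)

include = "https://www.youtube.com/watch?v=MhlFR2wWqHA"
unfiltered = con.execute("SELECT count(*) FROM media").fetchone()[0]
filtered = con.execute(
    "SELECT count(*) FROM media WHERE path LIKE ?", (f"%{include}%",)
).fetchone()[0]
print(unfiltered, filtered)  # 3 1
```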

After this bug is fixed, supplying a video URL should no longer download everything; instead, lb dl should print "No media found" to stderr and exit with a nonzero code.
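The suggested behavior could be sketched like this (a hypothetical helper, not code from the library):

```python
import sys

def check_media(rows):
    """Hypothetical sketch: if the query matched no media, report the
    problem on stderr and signal failure with a nonzero exit code
    instead of silently falling back to downloading everything."""
    if not rows:
        print("No media found", file=sys.stderr)
        return 2  # nonzero exit status
    return 0
```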

To download one or more specific playlists, it's possible to do this now:

library fs video.db --to-json --playlists https://www.youtube.com/c/BlenderFoundation/videos | library download --video video.db --from-json -

Note that the - is required in the download command to read from stdin.
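What `--from-json -` implies can be sketched as reading one JSON object per line from stdin (simulated here with io.StringIO; the "path" key and URLs are illustrative assumptions):

```python
import io
import json

# Simulated stdin carrying line-delimited JSON, one media record per line.
stdin = io.StringIO(
    '{"path": "https://www.youtube.com/watch?v=a"}\n'
    '{"path": "https://www.youtube.com/watch?v=b"}\n'
)

# Parse each non-blank line and collect the media paths to download.
paths = [json.loads(line)["path"] for line in stdin if line.strip()]
print(paths)
```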

The existing functionality should be preserved, but you should now be able to download videos without running tube-add first (only for a single video or the first video of a playlist; if you want the full playlist, run tube-add first):

library download --video video.db $URL

Or use line-delimited URL files (similar to yt-dlp -a):

library download --video video.db --file file_with_urls1.txt file_with_urls2.txt
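The --file semantics can be sketched as reading one URL per line from each file and skipping blank lines (the filename and URLs below are illustrative assumptions):

```python
import pathlib
import tempfile

# Build a throwaway URL file like the ones passed to --file,
# then read it back one URL per line, skipping blanks.
with tempfile.TemporaryDirectory() as d:
    url_file = pathlib.Path(d, "file_with_urls1.txt")
    url_file.write_text("https://example.com/a\n\nhttps://example.com/b\n")
    urls = [line for line in url_file.read_text().splitlines() if line.strip()]

print(urls)
```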