reluce/szurubooru-toolkit

Import from URL

myalow opened this issue · 4 comments

Hi!

Big fan of the toolkit, since there's not a lot out there for szurubooru importing, management, etc.
One thing I would like to see, however, is URL support. Although it is nice to be able to scrape results from a query, I would like to be able to pass in one or more links to booru posts (or hundreds via a URL list in a .txt file or something) and scrape that way.

Essentially, being able to do import-from-booru gelbooru gelbooru-urls.txt would be a dream come true, but even import-from-booru gelbooru https://gelbooru.com/index.php?page=post&s=view&id=1234567 would be great, and the end user could make a shellscript or similar to handle mass imports from there.

Hey, I get where you're coming from 😉.

With a Gelbooru URL, do you mean the URL of the post? If so, you can use id:1234567 as your query and it would import that post just fine. However, you cannot use multiple ids in your query.

In that case, you can write a bash script which calls the script for each URL:

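# Read one post URL per line; the || test also picks up a last line without a trailing newline
while IFS= read -r URL || [ -n "${URL}" ]
do
    # Extract the numeric post ID from the trailing "id=" parameter
    POST_ID=$(echo "${URL}" | sed -E "s/.*id=([0-9]*)$/\1/")
    import-from-booru gelbooru "id:${POST_ID}"
done < gelbooru-urls.txt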

(I did not test the script except for the sed filtering)

I can implement the same logic within the script when I can find the time.
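
For anyone adapting the loop above: the sed expression only matches when id= is the last thing in the URL. A slightly more defensive sketch is below. It assumes the post ID always sits in an id= query parameter, and extract_post_id is a hypothetical helper name, not something the toolkit ships.

```shell
# Hypothetical helper (not part of szurubooru-toolkit): print the numeric
# post ID found in the id= query parameter of a URL. Returns non-zero and
# prints nothing if the URL has no purely numeric id= parameter.
extract_post_id() {
    # -n plus the trailing "p" flag prints only when the substitution matched
    id=$(printf '%s\n' "$1" | sed -n -E 's/.*[?&]id=([0-9]+)([&#].*)?$/\1/p')
    [ -n "$id" ] && printf '%s\n' "$id"
}

# Usage sketch (assumes import-from-booru is on PATH and gelbooru-urls.txt exists):
#   while IFS= read -r url || [ -n "$url" ]; do
#       post_id=$(extract_post_id "$url") || { echo "skipping: $url" >&2; continue; }
#       import-from-booru gelbooru "id:${post_id}"
#   done < gelbooru-urls.txt
```

Unlike the original one-liner, this skips lines without a usable ID instead of passing the whole URL through as the query.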

didn't even think of going at it with post IDs, you're a smarter man than I lol.

I think that solves my issue. I'll leave the issue open in case you want to keep it open until you add the enhancement; otherwise, feel free to close. If I find the free time, I might open a PR and try to implement it myself.

Cheers!

I'm currently working on an integration with gallery-dl. You can check out the branch gallery_dl and the script import-from. For now it supports the existing Boorus (+Sankaku!), but you have to supply the URL(s) of the post(s) instead of a search query. Alternatively, you can use --input-file and specify a file with the URLs to import.

Maybe import-from will supersede import-from-booru and import-from-twitter in the future, but we'll see.

Added in 0.8.0:

import-from-url --input-file urls.txt
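
If the URL list was pasted together by hand, it may help to tidy it before handing it to --input-file. The sketch below assumes the file holds one URL per line (the thread does not spell out the format); clean_url_list and raw-urls.txt are hypothetical names used only for illustration.

```shell
# Hypothetical helper (not part of szurubooru-toolkit): read a raw URL list
# on stdin and write a cleaned list to stdout.
clean_url_list() {
    # Drop Windows carriage returns, trim surrounding whitespace, then keep
    # only non-empty lines, and only the first occurrence of each URL.
    tr -d '\r' \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
        | awk 'NF && !seen[$0]++'
}

# Usage sketch (assumes import-from-url from 0.8.0 or later is on PATH):
#   clean_url_list < raw-urls.txt > urls.txt
#   import-from-url --input-file urls.txt
```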