Import From Booru only downloads first 100 posts

Question

Kalidibus opened this issue a year ago · 7 comments

Kalidibus commented a year ago

Downloading from any tag with more than 100 posts never returns more than 100, even when using the --limit setting.

--limit works for setting it to a lower number, but not above 100.

Answer 1 · 2023-07-29T02:46:15.000Z

Kalidibus commented a year ago

Answer 2 · 2023-08-06T14:34:08.000Z

I pushed a fix to the attached branch of this issue. Can you test it with that?
When I execute the script, at least I get the correct number of posts, but I haven't run it through.

Answer 3 · 2023-08-13T13:44:43.000Z

Just tested it myself and it worked. The fix is included in main and a patch release will follow shortly.

Answer 4 · 2023-08-13T23:27:51.000Z

Very sorry I didn't get a chance to test this out earlier.

This is the behaviour I'm noticing on 0.9.2 when running "import-from-booru":

Gelbooru

Running without the --limit argument (e.g. import-from-booru gelbooru "masso") results in the error message below.
Running with the --limit argument with a value of 100 or less (e.g. import-from-booru gelbooru --limit 99 "masso") results in the same error message.
Running with the --limit argument with a value of 100 or greater (e.g. import-from-booru gelbooru --limit 150 "masso") appears to behave normally at first, but always returns 0 posts found.

Error:
szurubooru-toolkit# import-from-booru gelbooru "masso"
[INFO] [13.08.2023, 23:22:22 UTC]: Retrieving posts from gelbooru with query "masso"...
[ERROR] [13.08.2023, 23:22:22 UTC] [import-from-booru.<module>]: An error has been caught in function '<module>', process 'MainProcess' (106), thread 'MainThread' (139922118998912):
Traceback (most recent call last):
  File "/usr/local/bin/import-from-booru", line 6, in <module>
    sys.exit(main())
  File "/szurubooru-toolkit/src/szurubooru_toolkit/scripts/import_from_booru.py", line 150, in main
    total_posts = next(posts)
  File "/szurubooru-toolkit/src/szurubooru_toolkit/utils.py", line 432, in get_posts_from_booru
    results = sync(booru.client.search_posts(tags=query.split()), limit=limit)
  File "/usr/local/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
TypeError: sync_co() got an unexpected keyword argument 'limit'
Exception ignored in: <coroutine object Gelbooru.search_posts at 0x7f4225be3ef0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/warnings.py", line 537, in _warn_unawaited_coroutine
    warn(msg, category=RuntimeWarning, stacklevel=2, source=coro)
  File "/usr/local/lib/python3.11/warnings.py", line 109, in _showwarnmsg
    sw(msg.message, msg.category, msg.filename, msg.lineno,
  File "/szurubooru-toolkit/src/szurubooru_toolkit/utils.py", line 43, in ignore_decompression_bomb_warning
    return warnings.defaultaction(message, category, filename, lineno, file, line)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'str' object is not callable
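
For readers following the traceback: in the line from utils.py, limit=limit sits outside the parentheses of search_posts(...), so it is passed to the sync wrapper rather than to the search call, which is what the first TypeError complains about. Below is a minimal, self-contained sketch of that pattern. The names run_sync and search_posts here are hypothetical stand-ins, not the toolkit's or the Gelbooru client's real API, and whether the real search call accepts a limit keyword is an assumption.

```python
import asyncio


async def search_posts(tags, limit=100):
    """Hypothetical stand-in for an async booru search (not the real client)."""
    return [f"post-{i}" for i in range(limit)]


def run_sync(coro):
    """Hypothetical sync wrapper: it accepts only the coroutine itself."""
    return asyncio.run(coro)


def broken(query, limit):
    # Same shape as the traceback: limit is handed to the wrapper.
    coro = search_posts(tags=query.split())
    try:
        return run_sync(coro, limit=limit)
    except TypeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)


def fixed(query, limit):
    # limit moved inside the search call, where the parameter is defined.
    return run_sync(search_posts(tags=query.split(), limit=limit))


if __name__ == "__main__":
    print(broken("masso", 150))
    print(len(fixed("masso", 150)))
```

This only illustrates the class of error; it does not claim to be the change that was actually pushed to main.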

Danbooru

Does not generate the errors above and works well, but the number of posts pulled is always different from the value of the --limit argument.

Answer 5 · 2023-08-13T23:41:52.000Z

Some more strange behaviour:

Answer 6 · 2023-08-14T20:08:54.000Z

You're right, there was still some work to do. I pushed another quick fix to main, with which Danbooru (and hopefully other boorus) works just fine.

However, I've noticed that the Gelbooru module I'm using only retrieves a maximum of 20 results instead of the usual 100. A limit of 100 gets ignored, although using the API URL in the browser does work. I also cannot get any results anymore, so I probably got rate limited for today.
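
As a rough illustration of why a per-request cap matters here: if a client only ever returns one page per request (20 or 100 posts), honouring a larger --limit means requesting further pages until enough posts are collected. The sketch below shows that loop under stated assumptions. fetch_page is a hypothetical stand-in that pretends 250 posts match the query; it is not the Gelbooru module's API, and real code would also have to respect rate limits.

```python
def fetch_page(page, page_size):
    """Hypothetical single API request; pretends 250 posts match the query."""
    total = 250
    start = page * page_size
    return list(range(start, min(start + page_size, total)))


def fetch_up_to(limit, page_size=100):
    """Collect at most `limit` posts by requesting consecutive pages."""
    posts = []
    page = 0
    while len(posts) < limit:
        batch = fetch_page(page, page_size)
        if not batch:  # no more results available
            break
        posts.extend(batch)
        page += 1
    return posts[:limit]  # trim the overshoot from the last page


if __name__ == "__main__":
    print(len(fetch_up_to(150)))                # two pages of 100, trimmed
    print(len(fetch_up_to(150, page_size=20)))  # eight pages of 20, trimmed
```

With a page size of 20 the same limit simply takes more requests, which may be one reason a rate limit is hit sooner.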

There is still some old code in import-from-booru which probably needs rewriting, but as the import-from-url script exists, I'm kinda too lazy to do any major rework.

You can always use the import-from-url script, but yeah, the syntax is not as clean as the import-from-booru script (quote the URL so the shell does not interpret the & and ? characters): import-from-url [--range "1-20"] "https://gelbooru.com/index.php?page=post&s=list&tags=masso"

Answer 7 · 2023-12-08T02:27:25.000Z

I just tried the import-from-url script workaround and yeah, that works great, thanks!