reluce/szurubooru-toolkit

Image/Animation type restriction doesn't seem to be working

shibaobun opened this issue · 8 comments

The type:image,animation restriction doesn't seem to be working, so the list of returned posts includes mp4s and other media. This can cause errors, for example when trying to resize what should only be static images.

I've already fixed this issue on my docker branch here, namely this commit.

However, my docker branch also includes a lot of other minor fixes/changes including:

  • Catching the image resizing if it errors (just continues with original image)
  • Continuing on a deepbooru tagging failure instead of failing
  • Finally, dockerizing into a container with internal cronjobs so it can be run at regular intervals
    • Not implemented yet, but I've done it before: I could pretty easily set up a GitHub action to build and deploy new builds to Docker Hub if you'd like!
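
The first two fixes in the list above could look roughly like the following. This is only a minimal sketch and not the code from the branch: the names resize_or_original and tag_all are hypothetical, and the resize and tag callables stand in for whatever the toolkit actually uses for resizing and deepbooru tagging.

```python
from typing import Callable, Iterable


def resize_or_original(image: bytes, resize: Callable[[bytes], bytes]) -> bytes:
    """Return the resized image, or the original bytes if resizing fails.

    `resize` is a hypothetical stand-in for the real resizing routine.
    """
    try:
        return resize(image)
    except Exception as exc:
        # Fall back to the untouched image instead of aborting the run.
        print(f"Resize failed, continuing with original image: {exc}")
        return image


def tag_all(posts: Iterable[str], tag: Callable[[str], list[str]]) -> dict[str, list[str]]:
    """Tag every post, skipping the ones where tagging raises.

    `tag` is a hypothetical stand-in for the deepbooru tagging call.
    """
    tagged: dict[str, list[str]] = {}
    for post in posts:
        try:
            tagged[post] = tag(post)
        except Exception as exc:
            # One bad post (e.g. a video file) should not stop the whole batch.
            print(f"Tagging failed for {post}, skipping: {exc}")
            continue
    return tagged
```

Catching a broad Exception is a deliberate simplification here; narrower exception types would be preferable once the actual failure modes are known.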

Would you be interested in any of these changes? I could cherrypick any of the changes out and open a pr if you'd like! Thank you!

Hey, thanks for the contributions. You can create a pr for the fixes and one for the Docker related stuff.

Do you intend to run the scripts in the container only with cron? I reckon that running the scripts with docker container exec would be a good addition (like docker container exec -t szurubooru-toolkit import-from [...] ).

I'll get started on those PRs then!
For the docker container, you can run one-off commands like docker exec -it container_name poetry run import-from [...], but for my use case (and for others, I figured) I would want the commands to run on a regular basis for new files without having to spin up a k8s cluster or something.
For people interested in one-off commands, I figured they could leave the crontab blank and either run their commands with docker exec, or replace the command option in docker-compose.

Yes, using cron does of course make a lot of sense. I have some other points regarding the Docker image, but I'll add those comments in the PR.

The change in #1c02aea unfortunately doesn't seem to fix the issue. I've fiddled around with this issue before, but couldn't get it working.

When specifying the query with type:image,animation in the web UI it does work as intended, but the API returns video files nonetheless. Did video files get returned in your testing? Maybe it's a bug with the szurubooru API.

I was getting video files returned in the query that would fail when trying to get resized and also couldn't be parsed by deepbooru, which might explain some of the fixes that I added in #23 😅
However, at least when pointing to my instance, which is hosted externally, adding it to the query fixed it. Is your instance running locally, and does it work off the API in that case? I haven't looked too much at the code, but I can look into it more.

Yes, my instance is running locally. But that doesn't make a difference, as /api/posts/?query= is queried in both cases.

I just tested it with a smaller result set (around 20 results) and it worked as expected. When I use a larger query (around 8000 results), mp4s/webms start to randomly pop up.
When I check with Postman at the offset where those posts should appear, no video files are returned. I'm kind of lost on this. I'll check again in a VS Code debug session when I get around to it; maybe I'll find something there 🤔.

I found the error. When more than 100 results are retrieved, I build a new query URL which includes the offset. In that query, I didn't include the type:image,animation filter, so every page after the first was unfiltered. See szurubooru.py:93.

After updating the query and running a test across three pages, no video files are getting returned. The fix is pushed to main.
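
The idea behind the fix can be sketched as follows. This is not the code from szurubooru.py; it is a minimal illustration assuming the /api/posts/ endpoint takes query, offset and limit parameters and that pages hold 100 results, as the "more than 100 results" remark suggests. The names build_page_url and page_urls are hypothetical.

```python
from urllib.parse import urlencode

PAGE_SIZE = 100  # assumption: derived from the "more than 100 results" remark


def build_page_url(base_url: str, query: str, offset: int, limit: int = PAGE_SIZE) -> str:
    """Build a posts URL for one page; the query is URL-encoded every time."""
    params = urlencode({"query": query, "offset": offset, "limit": limit})
    return f"{base_url}/api/posts/?{params}"


def page_urls(
    base_url: str,
    user_query: str,
    total: int,
    type_filter: str = "type:image,animation",
) -> list[str]:
    """Return one URL per page, each carrying the same type filter.

    The bug in this thread was that only the first page included the
    filter; here the full query is built once and reused for every offset.
    """
    full_query = f"{type_filter} {user_query}".strip()
    return [
        build_page_url(base_url, full_query, offset)
        for offset in range(0, total, PAGE_SIZE)
    ]
```

Building the full query string once and passing it to every page request keeps the first page and the offset pages from drifting apart.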

Thank you for your work on this project!! This is great news 😄