Earlopain/FoxTrove

Improve readme

Closed this issue · 7 comments

Hey there,

I'm really curious what this project is, but the readme has no information, and browsing through the code gives me only half an idea.

So, what is this project, if I may ask?
My immediate guess is maybe a reverse image lookup system? Possibly?

Hey there, thanks for stalking me :D
It's something like saucenao, but with tight integration for e6. It basically lets you input a bunch of artist URLs, then it downloads their submissions and looks the images up on e6 to see whether they are already uploaded or not. Here's a view of searching for bvas for kenket:
[image]
You can also search for images that aren't uploaded yet.
One thing that has always bugged me about saucenao is that they don't support a bunch of sites I would like them to.

This can currently scrape:

  • Twitter
  • Inkbunny
  • Deviantart
  • Artstation
  • Reddit
  • Furaffinity
  • Weasyl
  • Newgrounds

Usability is still pretty bad; it's mostly just buttons where only I know what they do. It's also missing account features, which it would need should I ever set this up somewhere publicly accessible.

The biggest problem right now is speed. e6 only allows 1 iqdb request every 2 seconds, which means you potentially need to wait hours until just a single artist has finished processing. I'm trying to solve that by updating the iqdb version e6 uses, since newer versions use a simple sqlite database which can potentially be dumped.
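To put that limit in numbers, here is a back-of-the-envelope sketch. It assumes every scraped image needs exactly one iqdb lookup and ignores download time. The 5,000-image gallery is a made-up figure and the function names are purely illustrative.

```python
def iqdb_wait_seconds(image_count: int, seconds_per_request: float = 2.0) -> float:
    """Lower bound on total wait when each image costs one rate-limited lookup."""
    return image_count * seconds_per_request


def format_duration(seconds: float) -> str:
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Hypothetical gallery size, for illustration only.
print(format_duration(iqdb_wait_seconds(5000)))  # prints "2h 46m 40s"
```

So a single prolific artist can easily take hours at 1 request every 2 seconds, which is why skipping or batching lookups matters more than making each one faster.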

This has no affiliation with e6, it's just something I do because I wanted it. I will write up a proper readme and add a license sometime soon.

Ooooh, I see! That is quite an interesting tool!

Yeah, doesn't seem like usability needs to be great, it doesn't need to be mass market or anything I imagine. But yeah, that iqdb speed is not handy..

I wonder if a quick pass could be done by checking whether images with matching source URLs exist?
I've actually been thinking lately that it would be neat to use one of the multi-furrysite reverse image lookup things to check whether source URLs on e621 are fully up to date. There are a few things which exist on both e621 and FA or wherever, but where e621 only links the artist profile rather than the submission link.

Apologies for poking my snout in, I'm just curious about all the furry sites and tools around here on github :3

No need to apologise, I was just joking.

Looking at source urls might be possible, but there are also quite a few instances where people just add alternate sources without realizing that it was actually a superior image. Artists also sometimes update the submission on sites like FA where that's possible, to correct mistakes or just upload higher res files because they forgot to do so initially.
Using saucenao or others to add missing links sounds like a good idea. The results are accurate most of the time, though it has problems with sketches (since they are all mostly white) and a few other cases where the match isn't correct, so you probably won't be able to fully automate it.

Ahh, yeah, but maybe using sources could be a first-pass at the matching process, rather than querying e621's iqdb for each image? Would still need to check filesizes for quality, but could speed up that 1 request every 2 seconds step.
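A rough sketch of what such a first pass could look like, assuming a locally available list of uploaded posts with their source URLs and file sizes. Everything here (`UploadedPost`, `normalize_source`, `first_pass`, the example.com URLs) is hypothetical and not taken from the project. A source match only says the same submission page was linked, not that the file is identical, so the file size comparison is a crude stand-in for the quality check.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple
from urllib.parse import urlsplit


@dataclass(frozen=True)
class UploadedPost:
    post_id: int
    sources: Tuple[str, ...]
    file_size: int  # bytes


def normalize_source(url: str) -> str:
    """Reduce a URL to host + path so trivial variations still match."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return f"{host}{parts.path.rstrip('/')}"


def build_source_index(posts: Iterable[UploadedPost]) -> Dict[str, UploadedPost]:
    """Map every normalized source URL to the first post that lists it."""
    index: Dict[str, UploadedPost] = {}
    for post in posts:
        for source in post.sources:
            index.setdefault(normalize_source(source), post)
    return index


def first_pass(submission_url: str, scraped_size: int,
               index: Dict[str, UploadedPost]) -> str:
    """Classify a scraped submission without touching iqdb if possible."""
    post = index.get(normalize_source(submission_url))
    if post is None:
        return "needs_iqdb"        # no source match, fall back to the slow path
    if scraped_size > post.file_size:
        return "larger_at_source"  # same page linked, but the scraped file is bigger
    return "uploaded"


# Made-up data for illustration.
index = build_source_index([
    UploadedPost(1, ("https://www.example.com/view/123/",), 500_000),
])
print(first_pass("http://example.com/view/123", 400_000, index))     # uploaded
print(first_pass("https://example.com/view/123/", 900_000, index))   # larger_at_source
print(first_pass("https://example.com/view/999", 100_000, index))    # needs_iqdb
```

Only the submissions that come back as `needs_iqdb` would then have to go through the rate-limited lookup.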

And yeah, I'm familiar with the issues with sketches and stuff. I've done a bunch of work with the python imagehash library, and sometimes the errors are quite bad, especially with sketches. I doubt it could be fully automated, but you could probably build something that runs through and says "These images look the same, but don't have source links, are they the same?" and gives a user a yes/no for each, adding those source tags as they go.
But that's a separate project idea really! I might poke around at that.
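To illustrate why mostly-white sketches trip up this kind of matching, here is a stdlib-only toy version of an average hash. It is not the imagehash library and not what saucenao or iqdb do internally. It works on tiny hand-made 4x4 grayscale grids, and all names and the distance threshold are assumptions for the example.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Grid = List[List[int]]  # grayscale values, 0 = black, 255 = white


def average_hash(pixels: Grid) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def review_candidates(hashes: Dict[str, int], max_distance: int) -> List[Tuple[str, str]]:
    """Pairs close enough that a human should answer 'are these the same?'."""
    return [
        (name_a, name_b)
        for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2)
        if hamming(hash_a, hash_b) <= max_distance
    ]


def blank() -> Grid:
    return [[255] * 4 for _ in range(4)]


# Two different "sketches": white pages with a single dark mark in opposite corners.
sketch_a = blank()
sketch_a[0][0] = 0
sketch_b = blank()
sketch_b[3][3] = 0
# Two high-contrast images that are mirror images of each other.
left_dark = [[0, 0, 255, 255] for _ in range(4)]
right_dark = [[255, 255, 0, 0] for _ in range(4)]

hashes = {
    "sketch_a": average_hash(sketch_a),
    "sketch_b": average_hash(sketch_b),
    "left_dark": average_hash(left_dark),
    "right_dark": average_hash(right_dark),
}
print(hamming(hashes["sketch_a"], hashes["sketch_b"]))     # 2 of 16 bits differ
print(hamming(hashes["left_dark"], hashes["right_dark"]))  # 16 of 16 bits differ
print(review_candidates(hashes, max_distance=2))           # [('sketch_a', 'sketch_b')]
```

The two sketches are different drawings, yet their hashes differ in only 2 of 16 bits because nearly every pixel is white. That is the kind of pair a yes/no review step would need to catch.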

I'm just going to leave it as is for now, and wait for e621ng/e621ng#368 to get merged (hopefully). There are a few things I had to change in iqdb which I talked about with Kira. At first glance it looks like the numbers are just straight up better, but you never know until you've got a decent enough sample size. It works well enough for me right now; I didn't even think someone would notice it until I made a forum post or something myself.

That's fair enough! I didn't even know you could query e621 iqdb in that way to be honest!

I threw a few things into the readme. I can of course still add more if you have any questions remaining, but it's at least better than before. Writing isn't my strong suit, so please tell me if I forgot anything.