sherlock-project/sherlock

Genius.com returns false positive when dot is used

urlan opened this issue · 8 comments

Checklist

  • [ X ] I'm reporting a website that is returning false positive results
  • [ X ] I've checked for similar site support requests including closed ones
  • [ X ] I've checked for pull requests attempting to fix this false positive
  • I'm only reporting one site (create a separate issue for each site)

Description

Example: python sherlock julia.cat

The username julia.cat on Genius.com does not resolve to Genius.com/julia.cat, but to Genius.com/julia. The site removes all letters after the dot.

I tested this with the latest version and it seems to work right away. What version are you running?

I'm using the latest version. But I'm referring to what happens when you access the website itself.

Did you try to access https://genius.com/artists/julia.cat? Try it and take a look at URL.

I see the issue. If the username is x.y, the Genius URL is returned as https://genius.com/artists/x.y, when the real URL should be https://genius.com/artists/xy, since Genius just removes all dots from the username in the URL.
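
To make the mismatch concrete, here is a small sketch of the two URLs involved. It assumes, as described in this comment, that the site simply drops dots from the username; the function names are hypothetical and nothing here is taken from Sherlock's code.

```python
def requested_url(username: str) -> str:
    """URL built directly from the username as typed."""
    return f"https://genius.com/artists/{username}"


def normalized_url(username: str) -> str:
    """URL after dots are stripped (assumption based on the comment above)."""
    return f"https://genius.com/artists/{username.replace('.', '')}"


username = "x.y"
print(requested_url(username))
print(normalized_url(username))
# When the two differ, a "found" result points at a different profile
# than the one that was searched for.
print(requested_url(username) != normalized_url(username))
```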

Thanks for opening this issue and reporting this problem, @urlan :)

I checked quickly, and it seems that other sites also have the same policy of removing the dot and strings after the dot in the username, not just the Genius site.

I will look into this more carefully soon, and we will make an update to fix the problem if possible.

You're welcome, Matheus. I'm the one who should be thanking you for the tool. Big hug.

@matheusfelipeog

Huh. Just ran a few tests and you're right. Didn't even realize that was happening.

Archive of Our Own
Eintracht Frankfurt Forum
Gumroad
HackerRank
OpenStreetMap
Pinkbike
Splits[.]io
Strava
eintracht

I probably missed a couple, but those were my incorrect hits with blue.man. I've addressed these ones with a regex check ^[^.]*?$ in #2068 as well.
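
For anyone following along, a minimal sketch of what a pattern like ^[^.]*?$ does when applied to a username before any request is sent. The helper name and the sample usernames are only for illustration, and this is not the code from the PR.

```python
import re

# Pattern quoted above: matches only strings that contain no dot at all.
NO_DOT_PATTERN = re.compile(r"^[^.]*?$")


def username_allowed(username: str) -> bool:
    """Return True if the username contains no dot (hypothetical helper)."""
    return NO_DOT_PATTERN.search(username) is not None


for name in ("blue.man", "julia.cat", "blueman"):
    verdict = "check the site" if username_allowed(name) else "skip, dot not supported"
    print(f"{name}: {verdict}")
```

With a check like this, a dotted username is skipped for the affected sites instead of being reported as a hit on the wrong profile.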

Thanks for identifying other sites that follow the same policy, @ppfeister.

This isn't related to this issue, but:

I noticed that you're quite engaged and have been contributing a lot to the project in recent days. Thank you very much for that. I will review all your contributions. It might take a bit of time (I've been quite busy lately), but rest assured we will do it as soon as possible. Again, thank you.

@matheusfelipeog No worries! And really, no rush either.
Honestly wouldn't have even pinged if not for replying to the above. Not tryna be a "why hasn't my readme typo fix been approved yet!" person.

If you stumble upon any that were missed but can't address, feel free to drop a ping. In the meantime I'll probably add a few more while it waits for review.

Considering opening a PR to address 429s en masse and inverse validation (rather than error-based only), but those would probably come after the current ones. Possible merge conflicts if done before, depending on method. That would be a larger change as well so obviously subject to further review.