Posting names with weird characters
Opened this issue · 9 comments
/u/-----
This guy..... and his name..... it hurts. And it hurts the bot too.....
HFYREBORN BOT has already corrected this issue.... We are behind the times.
I saw the story this happened in a couple of days ago, and it's looking to me like a simple escape character issue related to Reddit markdown. Assuming the author's name went into the database correctly, the solution to this particular issue will be at or around line 50 of submissions.py and line 95 of bot.py.
Assuming that the name variables are strings, this specific case could be fixed with a simple replace command, such as the following:
post = config.POST_CONTENT.format(username=submission.author.name.replace("_", "\\_"))
Now, obviously, that doesn't cover the rest of the potential formatting characters, but those could fairly easily be covered if the author's name gets run through a sanitization function designed to escape any relevant markdown formatting characters.
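For what it's worth, here's a rough sketch of what such a sanitization function might look like. The name `escape_markdown` and the exact set of characters are my own assumptions, not anything taken from the bot's code, so adjust to taste:

```python
import re

# Characters assumed to trigger Reddit markdown formatting.
# This set is a guess for illustration, not an official list.
_MARKDOWN_CHARS = re.compile(r"([`*_~^\-])")


def escape_markdown(name: str) -> str:
    """Put a backslash in front of every markdown-sensitive character."""
    # In the replacement template, \\ is a literal backslash and \1 is the matched character.
    return _MARKDOWN_CHARS.sub(r"\\\1", name)


# Hypothetical usage: escape the name once, then reuse it wherever it is posted.
safe_name = escape_markdown("some_user-name")
```

Doing it in one helper means the post template only ever sees an already-escaped name, instead of chaining a `.replace()` call per character at every call site.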
errrmmm... can someone put this guy's name in a code block or something? I think GitHub's markdown is also mangling it.
That being said, it doesn't look like HFYBotReborn is doing anything special with the name output. The author link appears to just be "[" + username + "](https://reddit.com/u/" + username + ")";
__-___----_
As for HFYBotReborn...that's the weird part. I checked yesterday to see what they did, and realized it doesn't even have a relevant commit. I just double checked on Reddit, and that code with a raw, unsanitized username /will/ still use formatting, meaning it doesn't display properly. I know the username needs to be sanitized, but the frustrating thing, for me, is that I can't find where it gets done in HFYBotReborn, nor have I found where it might be done by RedditSharp, which HFYBotReborn relies on to post and comment.
As an aside, love your avatar. Used to follow lfgcomic for ages, not sure why I stopped.
Anyway, I just double-checked and found this page detailing reddit username rules. It would appear the only non-alphanumeric characters allowed in usernames are, coincidentally, the underscores and dashes used in the aforementioned username. Considering there are only two relevant lines, the fix is literally a matter of changing line 50 of submissions.py to
post = config.POST_CONTENT.format(username=submission.author.name.replace("_", "\\_").replace("-", "\\-"))
and line 95 of bot.py to
return config.POST_CONTENT.format(username=author.replace("_", "\\_").replace("-", "\\-"))
Of course, this assumes that the author name is a string, which would be the normal datatype for it.
Sorry for the notification spam, but I figured out where the relevant commit is. Turns out it's not in the main repository yet, and is instead over in j1xwnbsr's copy. And they did exactly what I was suggesting, though a bit overkill. Three of the characters being escaped (the tilde, asterisk, and caret) are supposedly not usable in Reddit usernames, while the underscore and dash are. So, it's looking like my suggestions above should work just fine.
Bot.py is an artifact of old. {magical hand waviness} Ignore the artifact {End magical hand waviness}
I've merged the submissions. @ around line 50 and again around line 77.
I'll work it into a pull after a bit.
So, it looks like we may need to add two more backslashes every time to the how-to-subscribe message, to make sure the bot reads it accurately. Is that correct?
It has been suggested to place it in a Code block too. `/USERNAME`
I'm not sure which way would be the best way.
Code block would be easier for output formatting, but would look a bit off and would require a bit of additional filtering when reading subscription/unsubscription requests. Adding the additional backslashes would not require the additional filtering, but would require adding another argument or filter for creating the posts. They're roughly equal in terms of coding work, but the additional backslashes are less work to maintain, less work for the users of the subreddit, and less work for the bot to do when reading posts.
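If the read side does end up needing a filter, it could be fairly small. Below is a sketch only: `normalize_username` is a hypothetical helper, and I'm assuming the name arrives as a single token that may be wrapped in backticks, prefixed with `/u/`, or carrying backslash escapes. The real request format may differ.

```python
import re


def normalize_username(token: str) -> str:
    """Recover a plain username from a token copied out of a bot post."""
    # Remove surrounding whitespace and code-block backticks, if any.
    token = token.strip().strip("`")
    # Drop an optional leading "/u/" or "u/" prefix (assumed format).
    token = re.sub(r"^/?u/", "", token)
    # Remove each backslash that escapes the character after it.
    return re.sub(r"\\(.)", r"\1", token)
```

With something like this in place, either output style (code block or extra backslashes) should read back to the same username, so the choice can be made on how the post looks rather than on parsing concerns.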