Breaking changes in 1.1

Question

revin opened this issue 10 years ago · 5 comments

Hello! Greenkeeper just notified me that github-slugger 1.1 broke npm/marky-markdown; the issue is that in 1.0, unicode emoji characters in headings were being stripped out, but now they're being converted to HTML entities.

For example, given ## 😄-😄 unicode hyphen unicode:

1.0 rendered --unicode-hyphen-unicode
1.1 now renders 😄-😄-unicode-hyphen-unicode

...which has broken a handful of our tests. Is there a way to get the old behavior? Thanks! 👍

Answer 1 · 2016-04-26T17:15:32.000Z

Thanks @revin!

The changes introduced in 1.1 were trying to fix a bug in github-slugger, which can be seen in this heading as slugged by GitHub: https://github.com/wooorm/gh-and-npm-slug-generation#Привет-non-latin-你好. Unfortunately, the fix seems to be too loose.

Now I’m wondering what exact characters are allowed by GH in slugs; white space and punctuation, and emoji? Thoughts?

Answer 2 · 2016-04-26T17:28:07.000Z

Apologies @revin! If we need to revert and/or redo this I'm fine with it!

Answer 3 · 2016-04-26T17:34:50.000Z

GitHub says they use vmg/redcarpet (which is mostly C code) to do the markdown rendering. So the answers should be there, or at least that's the first place I plan on looking when I get a chance.

Answer 4 · 2016-04-26T17:49:58.000Z

There’s also github/markup, but I couldn’t find any reference to the jargon word “slug” in those repos.

I’ve included the input/output from wooorm/gh-and-npm-slug-generation and chrisdickinson/emoji-slug-example in the tests locally, and the only difference from the current algorithm seems to be the emoji. I’ll look into integrating mathiasbynens/emoji-regex and see whether that fixes things.
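
For illustration, here is a rough sketch of the idea: strip emoji before slugging, while leaving non-Latin letters alone. This is not github-slugger's actual code. The function name `slugSketch`, the `stripEmoji` option, and the punctuation set are all hypothetical, and `\p{Extended_Pictographic}` is only an assumed stand-in for the pattern that emoji-regex would provide.

```javascript
// Hypothetical sketch only; not the real github-slugger implementation.

// Assumption: the Unicode property Extended_Pictographic approximates
// "emoji" here, standing in for the emoji-regex pattern.
const EMOJI = /\p{Extended_Pictographic}/gu;

// Assumption: a small ASCII punctuation set. Hyphens and underscores are kept.
const PUNCTUATION = /[!"#$%&'()*+,.\/:;<=>?@\[\]\\^`{|}~]/g;

function slugSketch(heading, { stripEmoji = true } = {}) {
  let text = heading.toLowerCase();
  if (stripEmoji) {
    // Remove emoji entirely, which is what 1.0 is reported to have done.
    text = text.replace(EMOJI, '');
  }
  // Drop punctuation, then turn each space into a hyphen.
  return text.replace(PUNCTUATION, '').replace(/ /g, '-');
}

// The heading from the original report:
console.log(slugSketch('😄-😄 unicode hyphen unicode'));
// → --unicode-hyphen-unicode
console.log(slugSketch('😄-😄 unicode hyphen unicode', { stripEmoji: false }));
// → 😄-😄-unicode-hyphen-unicode
```

With `stripEmoji` on, the output matches what 1.0 is reported to render for that heading; with it off, it matches the reported 1.1 output.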

Answer 5 · 2016-04-26T17:54:47.000Z

That might do it. I was reading through https://github.com/vmg/redcarpet/blob/master/ext/redcarpet/html.c#L273-L322 and now have a bunch of tabs open to see what the unicode support situation is on C's standard library functions. 😕