CondeNast/atjson

Unicode whitespace is stripped from leading and trailing positions in markdown paragraphs


Following up on #307, we should handle a longer list of Unicode whitespace characters:

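| Name | Code Point | Entity | Size |
| --- | --- | --- | --- |
| No Break Space | `\u00A0` | `&#x00A0;` | 👉&#x00A0;👈 |
| En Quad | `\u2000` | `&#x2000;` | 👉&#x2000;👈 |
| Em Quad | `\u2001` | `&#x2001;` | 👉&#x2001;👈 |
| En Space | `\u2002` | `&#x2002;` | 👉&#x2002;👈 |
| Em Space | `\u2003` | `&#x2003;` | 👉&#x2003;👈 |
| Thick Space | `\u2004` | `&#x2004;` | 👉&#x2004;👈 |
| Mid Space | `\u2005` | `&#x2005;` | 👉&#x2005;👈 |
| Six-per-em Space | `\u2006` | `&#x2006;` | 👉&#x2006;👈 |
| Figure Space | `\u2007` | `&#x2007;` | 👉&#x2007;👈 |
| Punctuation Space | `\u2008` | `&#x2008;` | 👉&#x2008;👈 |
| Thin Space | `\u2009` | `&#x2009;` | 👉&#x2009;👈 |
| Hair Space | `\u200A` | `&#x200A;` | 👉&#x200A;👈 |
| Zero Width Space | `\u200B` | `&#x200B;` | 👉&#x200B;👈 |
| Narrow No-break Space | `\u202F` | `&#x202F;` | 👉&#x202F;👈 |
| Medium Mathematical Space | `\u205F` | `&#x205F;` | 👉&#x205F;👈 |
| Ideographic Space | `\u3000` | `&#x3000;` | 👉&#x3000;👈 |
| Zero Width No-break Space | `\uFEFF` | `&#xFEFF;` | 👉&#xFEFF;👈 |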

I think this is a fairly exhaustive list of spaces, but if any more should be added, please comment 😄
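
For discussion, here is a minimal sketch of how the code points in the table above could be matched at the edges of a paragraph's text. The names `UNICODE_SPACES`, `EdgeSplit`, and `splitEdgeSpaces` are hypothetical and not part of atjson; the sketch only separates leading and trailing runs from the content, and leaves it to the renderer to decide what to do with them.

```typescript
// Hypothetical helper for illustration only; not atjson's actual API.
// Character class body covering every code point listed in the table:
// U+00A0, U+2000 through U+200B, U+202F, U+205F, U+3000, U+FEFF.
const UNICODE_SPACES = "\\u00A0\\u2000-\\u200B\\u202F\\u205F\\u3000\\uFEFF";

const LEADING = new RegExp(`^[${UNICODE_SPACES}]+`);
const TRAILING = new RegExp(`[${UNICODE_SPACES}]+$`);

interface EdgeSplit {
  leading: string;
  content: string;
  trailing: string;
}

// Split text into its leading run of listed spaces, the content in
// between, and its trailing run of listed spaces.
function splitEdgeSpaces(text: string): EdgeSplit {
  const leading = (text.match(LEADING) ?? [""])[0];
  const rest = text.slice(leading.length);
  const trailing = (rest.match(TRAILING) ?? [""])[0];
  const content = rest.slice(0, rest.length - trailing.length);
  return { leading, content, trailing };
}
```

One thing worth noting for whichever approach is taken: JavaScript's `\s` already matches every character in the table except Zero Width Space (`\u200B`), so a pattern built on `\s` alone would miss that one.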