Any text with \r (including with an escaped \) is removed by parser
Expected Behavior
In a text document, an escaped backslash before an r character should not cause the r character to be removed. This is necessary for e.g. LaTeX, where commands like \right) are common.
Actual Behavior
All \r are removed from the html document, even if the raw string is formatted as "\\r".
This seems to have been caused by the fix to #864
Steps to Reproduce
Create an HTML document with \\r inside the HTML string, then run it through the HTML parser.
Reproducible Demo
https://stackblitz.com/edit/html-react-parser-typescript-w9j4u9vu?file=src%2Findex.tsx
Environment
- Version: 5.2.0
- Browser: Chrome
- OS: macOS
Keywords
I have just encountered the same issue, which has broken equations containing \right. The workaround at the moment is to roll back to version 5.1.18.
Nicely explained and reproduced, @QuinnStraus.
Thanks for opening this issue! It's related to remarkablemark/html-dom-parser#902.
I wonder what's the best way to fix this without adding more complexity. Do you think introducing another option makes sense? E.g.:
parse(html, {
  escapeCarriageReturn: true, // defaults to false
});
I don't think there is necessarily a conflict here: in the raw string, the LaTeX \right will have an escaped backslash (\\right), and the original issue was about the carriage return character \r, so it should be fine.
After looking at the pull request it seems like it
- replaces all \r with \\r
- runs the DOM parsing (which presumably removes all \r)
- replaces all \\r with \r again.
However the final step also replaces all \\r in the original document with the carriage return \r, which leads to the issue.
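To make the analysis above concrete, here is a minimal sketch of that round trip using plain string replacement. It does not call html-react-parser or html-dom-parser; `dropCarriageReturns` is a hypothetical stand-in for the DOM parsing step, and the assumption that parsing drops carriage returns comes from the description above, not from reading the library source.

```typescript
// Hypothetical stand-in for the DOM parsing step.
// Assumption: parsing drops carriage return characters.
function dropCarriageReturns(input: string): string {
  return input.replace(/\r/g, "");
}

// Sketch of the three steps described above.
function naiveRoundTrip(input: string): string {
  // 1. carriage return -> backslash followed by "r"
  const escaped = input.replace(/\r/g, "\\r");
  // 2. parse (simulated)
  const parsed = dropCarriageReturns(escaped);
  // 3. backslash followed by "r" -> carriage return.
  //    This also matches a literal backslash-r that was already in the input.
  return parsed.replace(/\\r/g, "\r");
}

// Literal text: \left( x \right)
const naiveInput = "\\left( x \\right)";
const naiveOutput = naiveRoundTrip(naiveInput);

// The "\r" of "\right" has been turned into a carriage return.
console.log(JSON.stringify(naiveOutput));
```

The input contains no carriage return at all, yet the reverse replacement in step 3 still rewrites the start of \right, which matches the behaviour reported in this issue.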
I think a fix would be to replace \r with a string that is not likely to be used, such as __CAR_RETURN_(random symbols)__ so performing the reverse replacement does not replace anything else.
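A rough sketch of the placeholder idea, again with plain string operations rather than the real parser. The names `makePlaceholder`, `parseStandIn`, and `placeholderRoundTrip` are hypothetical, and this is not necessarily how the library ended up fixing it.

```typescript
// Hypothetical placeholder generator. The random suffix makes a collision
// with real document content unlikely (not impossible).
function makePlaceholder(): string {
  const suffix = Math.random().toString(36).slice(2);
  return `__CAR_RETURN_${suffix}__`;
}

// Hypothetical stand-in for the DOM parsing step.
// Assumption: parsing drops carriage return characters.
function parseStandIn(input: string): string {
  return input.replace(/\r/g, "");
}

function placeholderRoundTrip(input: string): string {
  const placeholder = makePlaceholder();
  // Swap real carriage returns for the placeholder before parsing.
  const escaped = input.split("\r").join(placeholder);
  const parsed = parseStandIn(escaped);
  // Restore only the placeholder; a literal backslash-r is never touched.
  return parsed.split(placeholder).join("\r");
}

// A real carriage return plus a literal \right in the same document.
const mixedInput = "line one\r\n\\left( x \\right)";
const mixedOutput = placeholderRoundTrip(mixedInput);

console.log(mixedOutput === mixedInput);
```

Using split and join instead of a regular expression avoids having to escape the placeholder, and the reverse step can only match text that the forward step inserted.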
Thanks for your help on this issue! Can you verify that the bug has been fixed in:
Does seem to be working now! Thanks for your quick fix!