[BUG] '&' character breaks parsing
Closed this issue · 4 comments
Hello!
Whenever my HTML text has an '&' in it, this library behaves very strangely. Here is a pretty basic example:
render() {
  const isValidNode = () => true;
  const html = "<h3>Test&</h3>";
  const instructions = [
    {
      shouldProcessNode: node => true,
      processNode: (node, children, index) => {
        console.log(node);
        return this.processNodeDefinitions.processDefaultNode(node, children, index);
      }
    }
  ];
  return (
    <div>
      {this.parser.parseWithInstructions(html, isValidNode, instructions)}
    </div>
  );
}
The output of this component then looks like this:
The console log in processNode gets called twice, and the output is:
Notice that the last 't' in "Test" gets duplicated as well.
If I change the html string to
html = "<h3> Test& </h3>";
Then the output doesn't have the stray h3, but the last "t" is still duplicated.
My versions are:
{
  "react-dom": "^16.6.3",
  "react": "^16.6.3",
  "html-to-react": "^1.3.4"
}
Any help is appreciated. Thank you!
For some reason I am unable to reproduce this anywhere except in my project... I found what the issue is, but I'm unsure why it only happens in my project.
The issue is in the tokenizer (which is part of a different project). When it encounters the &, it gets put into a state where it's expecting an encoded HTML entity (which doesn't exist), and then the rest of the logic is broken. Again, not sure why this only happens for me.
I'll close this issue for now.
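As a stopgap for the behaviour described above, one possible workaround (not something proposed in this thread) is to escape bare ampersands before the string reaches the parser, so the tokenizer never sees an '&' that does not start an entity. The sketch below is only an illustration: escapeBareAmpersands is a hypothetical helper name, and the regular expression only checks that an entity looks well formed, not that a named entity actually exists.

```javascript
// Hypothetical helper: escape every '&' that does not begin something
// shaped like an HTML entity (&name; / &#123; / &#x1F;).
// Assumption: entity-shaped sequences should be left alone, even if the
// name is not a real entity (e.g. "&foo;" is kept as-is).
function escapeBareAmpersands(html) {
  const bareAmpersand = /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;
  return html.replace(bareAmpersand, "&amp;");
}
```

With this helper, the string from the original report would be passed to the parser as escapeBareAmpersands("<h3>Test&</h3>"), while input that already contains valid entities stays unchanged.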
Note - setting this line to false fixes the issue.
Sorry for the spam. The issue is a dependency conflict between this package and some other packages in my project. The latest version of htmlparser2 is 3.10.0 (which your package requires), but some other packages in my project require older versions of htmlparser2, and for some reason the older versions are getting chosen.
Forcing all packages to use 3.10.0 fixes the issue.
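To check whether a project has the same kind of conflict, it can help to list every copy of htmlparser2 that ended up in node_modules. The following is a minimal Node.js sketch under stated assumptions: findPackageVersions is a hypothetical helper, it assumes a conventional nested node_modules layout, and it does not guard against symlink cycles.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: recursively collect every installed copy of
// `pkgName` under `root/node_modules`, including nested node_modules.
function findPackageVersions(root, pkgName, found = []) {
  const modulesDir = path.join(root, "node_modules");
  if (!fs.existsSync(modulesDir)) return found;

  for (const entry of fs.readdirSync(modulesDir)) {
    if (entry.startsWith(".")) continue; // skip .bin and other dot entries
    const entryPath = path.join(modulesDir, entry);
    // Scoped packages (@scope/name) live one directory level deeper.
    const candidates = entry.startsWith("@")
      ? fs.readdirSync(entryPath).map(name => path.join(entryPath, name))
      : [entryPath];

    for (const pkgDir of candidates) {
      const manifest = path.join(pkgDir, "package.json");
      if (!fs.existsSync(manifest)) continue;
      const { name, version } = JSON.parse(fs.readFileSync(manifest, "utf8"));
      if (name === pkgName) found.push({ version, location: pkgDir });
      // Descend into this package's own node_modules, if it has one.
      findPackageVersions(pkgDir, pkgName, found);
    }
  }
  return found;
}

// Example usage from the project root:
//   console.log(findPackageVersions(process.cwd(), "htmlparser2"));
```

If more than one version shows up, the package manager's own pinning mechanism (for example, the resolutions field in package.json when using Yarn) is one way to force a single version.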