Primary language: TypeScript. License: GNU Affero General Public License v3.0 (AGPL-3.0).

HTML to React

This library renders HTML strings as React elements without using dangerouslySetInnerHTML. It converts standard HTML elements, attributes and inline styles into their React equivalents and provides a simple way to modify and replace the content.

This library is a hard fork of https://github.com/peternewnham/react-html-parser. It includes some improvements and has been converted to TypeScript.

Install

npm install @hedgedoc/html-to-react
# or
yarn add @hedgedoc/html-to-react

Usage

import React from 'react';
import convertHtmlToReact from '@hedgedoc/html-to-react';

class HtmlComponent extends React.Component {
  render() {
    const html = '<div>Example HTML string</div>';
    // Use the default export imported above to convert the HTML string
    return <div>{ convertHtmlToReact(html) }</div>;
  }
}

Security

This library is not a replacement for properly sanitized HTML. It only provides the same level of protection that React does, and that protection is not complete. All HTML should be sanitized with a dedicated sanitization library (such as dompurify for Node.js/JavaScript) before being passed to this library, to protect against malicious injections.

What doesn't React protect me from?

Whilst React has a certain level of protection against injection attacks built in, it doesn't cover everything. For example:

  • XSS via iframe src: <iframe src="javascript:alert('xss')" />
  • XSS via link href: <a href="javascript:alert('xss')">click me</a>

Click here to see these in action and how to protect yourself using dompurify in the browser.
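To make the two cases above concrete, here is a minimal sketch of a URL scheme check. The helper name `hasAllowedProtocol` and the list of allowed schemes are assumptions made for illustration. They are not part of this library, and a check like this does not replace a dedicated sanitizer such as dompurify.

```typescript
// Hypothetical helper for illustration; it is not part of @hedgedoc/html-to-react.
// Assumption: only these URL schemes should be allowed in href/src attributes.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

function hasAllowedProtocol(url: string): boolean {
  try {
    // Parse with the platform URL parser so that case and whitespace tricks
    // (for example " JaVaScRiPt:...") are normalised before the check.
    // The dummy base lets relative URLs such as "/docs" parse as https.
    const parsed = new URL(url, 'https://example.invalid/');
    return ALLOWED_PROTOCOLS.has(parsed.protocol);
  } catch {
    // Input that cannot be parsed is treated as unsafe.
    return false;
  }
}

console.log(hasAllowedProtocol("javascript:alert('xss')")); // false
console.log(hasAllowedProtocol('https://example.com/'));    // true
```

Using the URL parser rather than a string prefix test means that mixed case and leading whitespace in the scheme are handled by the same rules a browser applies.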

Why doesn't this library protect me automatically?

Including a sanitizer as part of the library would mean making decisions for you that may not be correct. It is up to you to decide what level of sanitization you need and to act accordingly. Some users may already sanitize on the server, while others may have specialized requirements that a generic implementation cannot cover.

Additionally, HTML sanitization is hard to get right, and even the most popular and actively developed sanitizers have vulnerabilities discovered from time to time. Keeping sanitization outside this library lets users patch and deploy fixes immediately instead of waiting for a new release of this library that includes the fix.

API

function convertHtmlToReact(html, [options])

Takes an HTML string and returns the equivalent React elements.

Usage

import convertHtmlToReact from '@hedgedoc/html-to-react';

Arguments

  • html: The HTML string to parse
  • options: Options object
    • decodeEntities (boolean): Whether to decode HTML entities (defaults to true)
    • transform (function): Transform function that is applied to every node
    • preprocessNodes (function): Pre-process the nodes generated by htmlparser2

Transform Function

The transform function will be called for every node that is parsed by the library.

function transform(node, index)

Arguments

  • node: The node being parsed. This is the htmlparser2 node object. Full details can be found on the htmlparser2 project page, but the important properties are:
    • type (string): The type of node (tag, text, style etc)
    • name (string): The name of the node
    • children (array): Array of children nodes
    • next (node): The node's next sibling
    • prev (node): The node's previous sibling
    • parent (node): The node's parent
    • data (string): The text content, if the type is text
  • index (number): The index of the node in relation to its parent
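The node properties listed above can be explored with a small recursive walk. The sketch below uses a local `TextWalkNode` interface as a simplified stand-in for the htmlparser2 node. Both the interface and the helper `collectText` are assumptions for illustration, and the real node object carries more fields.

```typescript
// Simplified stand-in for the htmlparser2 node shape described above.
// Assumption: only the properties needed for this sketch are modelled.
interface TextWalkNode {
  type: string;
  name?: string;
  data?: string;
  children?: TextWalkNode[];
}

// Hypothetical helper: concatenates the text content of a node and its descendants.
function collectText(node: TextWalkNode): string {
  if (node.type === 'text') {
    // Text nodes carry their content in `data`.
    return node.data ?? '';
  }
  // Other nodes contribute the text of their children, in document order.
  return (node.children ?? []).map((child) => collectText(child)).join('');
}

// Hand-built tree equivalent to: <p>Hello <b>world</b></p>
const paragraph: TextWalkNode = {
  type: 'tag',
  name: 'p',
  children: [
    { type: 'text', data: 'Hello ' },
    { type: 'tag', name: 'b', children: [{ type: 'text', data: 'world' }] },
  ],
};

console.log(collectText(paragraph)); // "Hello world"
```

A helper like this could be called from inside a transform function when the decision depends on the text a node contains.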

Return Types

return null: Returning null prevents the node and all of its children from being rendered.

function transform(node) {
  // do not render any <span> tags
  if (node.type === 'tag' && node.name === 'span') {
    return null;
  }
}

return undefined: If the function does not return anything, or returns undefined, the default behaviour applies and the parser continues as usual.

return React element: React elements can be returned directly.

import React from 'react';
function transform(node) {
  if (node.type === 'tag' && node.name === 'b') {
    return <div>This was a bold tag</div>;
  }
}

preprocessNodes Function

Allows pre-processing the nodes generated from the html by htmlparser2 before being passed to the library and converted to React elements.

function preprocessNodes(nodes)

Arguments

  • nodes: The entire node tree generated by htmlparser2.

Return type

The preprocessNodes function should return a valid htmlparser2 node tree.
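As an illustration, the sketch below drops whitespace-only text nodes from the top level of the tree. The `PreprocessNode` interface is a simplified, assumed stand-in for the htmlparser2 node, and `dropBlankTextNodes` is a hypothetical name. The sketch does not update the `prev`, `next` or `parent` links, which real code may need to keep consistent.

```typescript
// Simplified stand-in for the htmlparser2 node shape.
// Assumption: only the properties needed for this sketch are modelled.
interface PreprocessNode {
  type: string;
  name?: string;
  data?: string;
  children?: PreprocessNode[];
}

// Hypothetical preprocessNodes implementation: returns a new array without
// top-level text nodes that consist only of whitespace.
function dropBlankTextNodes(nodes: PreprocessNode[]): PreprocessNode[] {
  return nodes.filter(
    (node) => !(node.type === 'text' && (node.data ?? '').trim() === '')
  );
}

// Hand-built top-level nodes equivalent to: "\n  <p>one</p>  <div></div>"
const topLevel: PreprocessNode[] = [
  { type: 'text', data: '\n  ' },
  { type: 'tag', name: 'p', children: [{ type: 'text', data: 'one' }] },
  { type: 'text', data: '  ' },
  { type: 'tag', name: 'div', children: [] },
];

console.log(dropBlankTextNodes(topLevel).map((node) => node.name)); // [ 'p', 'div' ]
```

Following the options described earlier, such a function would be passed as `convertHtmlToReact(html, { preprocessNodes: dropBlankTextNodes })`.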

function convertNodeToElement(node, index, transform)

Processes a node and returns the React element to be rendered. This function can be used in conjunction with the previously described transform function to continue to process a node after modifying it.

Usage

import { convertNodeToElement } from '@hedgedoc/html-to-react';

Arguments

  • node: The node to process
  • index (number): The index of the node in relation to its parent
  • transform: The transform function as described above

import { convertNodeToElement } from '@hedgedoc/html-to-react';
function transform(node, index) {
  // convert <ul> to <ol>
  if (node.type === 'tag' && node.name === 'ul') {
    node.name = 'ol';
    return convertNodeToElement(node, index, transform);
  }
}