to-markdown

An HTML to Markdown converter written in JavaScript.

The API is as follows:

toMarkdown(stringOfHTML, options);

Note: to-markdown v2+ requires Node 4 or later. For a version compatible with Node 0.10 to 0.12, use to-markdown v1.x.

Installation

Browser

Download the compiled script located at dist/to-markdown.js.

<script src="PATH/TO/to-markdown.js"></script>
<script>toMarkdown('<h1>Hello world!</h1>')</script>

Or with Bower:

$ bower install to-markdown
<script src="PATH/TO/bower_components/to-markdown/dist/to-markdown.js"></script>
<script>toMarkdown('<h1>Hello world!</h1>')</script>

Node.js

Install the to-markdown module:

$ npm install to-markdown

Then you can use it like below:

var toMarkdown = require('to-markdown');
toMarkdown('<h1>Hello world!</h1>');

(As of v1, it is no longer necessary to call .toMarkdown on the required module.)

Options

converters (array)

to-markdown can be extended by passing in an array of converters to the options object:

toMarkdown(stringOfHTML, { converters: [converter1, converter2] });

A converter object consists of a filter and a replacement. This example from the source replaces code elements:

{
  filter: 'code',
  replacement: function(content) {
    return '`' + content + '`';
  }
}
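A custom converter follows the same shape. The sketch below is an illustration rather than something taken from the source: the `kbdConverter` name and the choice to render `kbd` elements as inline code are assumptions.

```javascript
// Hypothetical converter (not part of to-markdown): renders <kbd> as inline code.
var kbdConverter = {
  filter: 'kbd',
  replacement: function (content) {
    return '`' + content + '`';
  }
};

// Intended usage, assuming to-markdown is installed:
//   var toMarkdown = require('to-markdown');
//   toMarkdown('Press <kbd>Ctrl</kbd>', { converters: [kbdConverter] });

// The replacement can also be exercised on its own:
console.log(kbdConverter.replacement('Ctrl')); // `Ctrl`
```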

filter (string|array|function)

The filter property determines whether or not an element should be replaced. DOM nodes can be selected by tag name, using either a string or an array of strings:

  • filter: 'p' will select p elements
  • filter: ['em', 'i'] will select em or i elements

Alternatively, the filter can be a function that returns a boolean depending on whether a given node should be replaced. The function is passed a DOM node as its only argument. For example, the following will match any span element with an italic font style:

filter: function (node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}
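To make the three filter forms concrete, here is a minimal sketch of how a filter could be checked against a node. `matchesFilter` is a hypothetical helper written for this illustration, not to-markdown's internal code, and the nodes are plain objects standing in for DOM nodes.

```javascript
// Hypothetical helper (an assumption, not the library's implementation):
// decides whether `node` is selected by a string, array, or function filter.
function matchesFilter(filter, node) {
  var tag = node.nodeName.toLowerCase();
  if (typeof filter === 'string') {
    return filter === tag;
  }
  if (Array.isArray(filter)) {
    return filter.indexOf(tag) !== -1;
  }
  if (typeof filter === 'function') {
    return Boolean(filter(node));
  }
  throw new TypeError('filter must be a string, an array, or a function');
}

// Plain objects standing in for DOM nodes.
var em = { nodeName: 'EM', style: {} };
var italicSpan = { nodeName: 'SPAN', style: { fontStyle: 'italic' } };

function isItalicSpan(node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}

console.log(matchesFilter('p', em));                  // false
console.log(matchesFilter(['em', 'i'], em));          // true
console.log(matchesFilter(isItalicSpan, italicSpan)); // true
```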

replacement (function)

The replacement function determines how an element should be converted. It should return the Markdown string for a given node. The function is passed the node’s content as its first argument and the node itself as its second (useful in more complex conversions). It is called in the context of toMarkdown, and therefore has access to the methods detailed below.

The following converter replaces heading elements (h1-h6):

{
  filter: ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],

  replacement: function(innerHTML, node) {
    // The heading level is the digit in the tag name, e.g. 'H2' -> 2
    var hLevel = Number(node.tagName.charAt(1));
    var hPrefix = '';
    for (var i = 0; i < hLevel; i++) {
      hPrefix += '#';
    }
    return '\n' + hPrefix + ' ' + innerHTML + '\n\n';
  }
}
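The second argument matters whenever the output depends on attributes rather than content alone. As a sketch under stated assumptions, the converter below reads `href` and `title` from an anchor element. `linkConverter` and `fakeLink` are hypothetical names, and `fakeLink` is a plain object standing in for a DOM node so the replacement can be run without the library.

```javascript
// Hypothetical converter for illustration: uses the node, not just its
// content, to read the href and title of an <a> element.
var linkConverter = {
  filter: function (node) {
    return node.nodeName === 'A' && Boolean(node.getAttribute('href'));
  },
  replacement: function (content, node) {
    var title = node.title ? ' "' + node.title + '"' : '';
    return '[' + content + '](' + node.getAttribute('href') + title + ')';
  }
};

// Minimal stand-in for a DOM node (an assumption made for this example).
var fakeLink = {
  nodeName: 'A',
  title: 'Home',
  getAttribute: function (name) {
    return name === 'href' ? 'https://example.com' : null;
  }
};

console.log(linkConverter.replacement('Example', fakeLink));
// [Example](https://example.com "Home")
```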

gfm (boolean)

to-markdown has beta support for GitHub Flavored Markdown (GFM). To enable it, set the gfm option to true:

toMarkdown('<del>Hello world!</del>', { gfm: true });

Methods

The following methods can be called on the toMarkdown object.

isBlock(node)

Returns true if the element is block level, and false otherwise.

isVoid(node)

Returns true if the element is void (that is, it cannot have content), and false otherwise.

outer(node)

Returns the content of the node along with the element itself.
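Because a replacement function is called in the context of toMarkdown, these methods should be reachable through `this` inside it. The sketch below assumes exactly that. `keepAsHtml` is a hypothetical converter that leaves some elements as raw HTML, and `fakeToMarkdown` is a stub context with simplified versions of `isBlock` and `outer`, used only so the example runs without the library.

```javascript
// Hypothetical converter: keeps <figure> and <details> elements as raw HTML,
// padding block-level elements with blank lines. It uses a regular function
// (not an arrow function) so that `this` refers to the calling context.
var keepAsHtml = {
  filter: ['figure', 'details'],
  replacement: function (content, node) {
    var html = this.outer(node);
    return this.isBlock(node) ? '\n\n' + html + '\n\n' : html;
  }
};

// Stub context standing in for toMarkdown (an assumption for this example;
// the real methods inspect actual DOM nodes).
var fakeToMarkdown = {
  isBlock: function (node) { return node.nodeName === 'FIGURE'; },
  outer: function (node) { return node.outerHTML; }
};

// Plain object standing in for a DOM node.
var figure = { nodeName: 'FIGURE', outerHTML: '<figure>x</figure>' };

var result = keepAsHtml.replacement.call(fakeToMarkdown, 'x', figure);
console.log(JSON.stringify(result)); // "\n\n<figure>x</figure>\n\n"
```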

Development

First make sure you have Node.js and npm installed, then:

$ npm install --dev
$ bower install --dev

Automatically browserify the module when source files change by running:

$ npm start

Tests

To run the tests in the browser, open test/index.html.

To run the tests in Node.js:

$ npm test

Credits

Thanks to all contributors. Also, thanks to Alex Cornejo for advice and inspiration for the breadth-first search algorithm.

Licence

to-markdown is copyright © 2011+ Dom Christie and released under the MIT license.