#Markdown parser written in Clojure
Demo
You can try out the parser here.
Installation
A markdown parser that compiles to both Clojure and ClojureScript.
Note: markdown-clj versions prior to 0.9.68 require Clojure 1.2+ to run; versions 0.9.68+ require Clojure 1.7.
Usage Clojure
Markdown-clj can be invoked by calling either the `md-to-html` or the `md-to-html-string` function.
The `md-to-html` function accepts an input containing Markdown markup and an output where the resulting HTML will be written. The input and output parameters are passed to a reader and a writer respectively:
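(ns foo
  (:use markdown.core)
  ;; input-stream and output-stream live in clojure.java.io
  (:require [clojure.java.io :refer [input-stream output-stream]]))
;; file names are handed to a reader and a writer
(md-to-html "input.md" "output.html")
;; streams can be passed in the same way
(md-to-html (input-stream "input.md") (output-stream "test.txt"))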
The `md-to-html-string` function accepts a string with markdown content and returns a string with the resulting HTML:
(md-to-html-string "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```")
<h1> This is a test</h1>some code follows<pre><code class="clojure">(defn foo [])
</code></pre>
Both `md-to-html` and `md-to-html-string` accept optional parameters:
Specifying `:heading-anchors` will create anchors for the heading tags, e.g.:
(markdown/md-to-html-string "###foo bar BAz" :heading-anchors true)
<h3><a name="heading" class="anchor" href="#foo_bar_baz"></a>foo bar BAz</h3>
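The example above suggests that the anchor target is derived from the heading text by lower-casing it and replacing spaces with underscores. The sketch below reproduces only that one observed mapping; `heading->anchor` is a hypothetical helper, not a function from markdown-clj, and the parser may treat punctuation or other characters differently.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper, inferred from the single example above:
;; trim, lower-case, then collapse each run of whitespace into "_".
(defn heading->anchor [heading-text]
  (-> heading-text
      string/trim
      string/lower-case
      (string/replace #"\s+" "_")))

(heading->anchor "foo bar BAz") ;=> "foo_bar_baz"
```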
The code blocks default to a highlight.js compatible format of:
<pre><code class="clojure">some code</code></pre>
Specifying `:code-style` will override the default code class formatting for code blocks, e.g.:
(md-to-html-string "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```"
:code-style #(str "class=\"brush: " % "\""))
<h1> This is a test</h1>some code follows<pre><code class="brush: clojure">
(defn foo [])
</code></pre>
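Since `:code-style` takes an ordinary one-argument function that receives the language name, a named function can be used in place of the anonymous one. The sketch below assumes a highlighter that expects a `language-` class prefix (the convention Prism uses); `prism-style` is a hypothetical name and not part of markdown-clj.

```clojure
;; Hypothetical helper, not part of markdown-clj: builds the class
;; attribute from the language name given after the backquotes.
(defn prism-style [lang]
  (str "class=\"language-" lang "\""))

;; The function can be checked on its own before handing it to the parser:
(prism-style "clojure") ;=> "class=\"language-clojure\""
```

It would then be passed to the parser as `:code-style prism-style`.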
Reference style links
The parser defaults to using inline references for performance reasons. To enable reference style links, pass in the `:reference-links? true` option:
(md-to-html-string
"This is [an example][id] reference-style link.
[id]: http://example.com/ 'Optional Title Here'"
:reference-links? true)
Footnotes
To enable footnotes, pass the `:footnotes? true` option:
(md-to-html-string
  "Footnotes will appear automatically numbered with a link to the footnote at the bottom of the page [^footnote1].
[^footnote1]: The footnote will contain a back link to the referring text."
  :footnotes? true)
Customizing the Parser
Additional transformers can be specified using the `:custom-transformers` key.
A transformer function must accept two arguments: the first is the string representing the current line and the second is the map representing the current state. It returns a vector containing the updated text and state.
The default state keys are:
* `:code` - inside a code section
* `:codeblock` - inside a code block
* `:eof` - end of file
* `:heading` - in a heading
* `:hr` - in a horizontal line
* `:lists` - inside a list
* `:blockquote` - inside a blockquote
* `:paragraph` - in a paragraph
* `:last-line-empty?` - was last line an empty line?
For example, if we wanted to add a transformer that would capitalize all text we could do the following:
(defn capitalize [text state]
[(.toUpperCase text) state])
(markdown/md-to-html-string "#foo" :custom-transformers [capitalize])
<H1>FOO</H1>
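A transformer can also consult the state map before touching the text. The following sketch leaves lines alone while the `:code` or `:codeblock` key is set and otherwise replaces `(c)` with an HTML entity. `copyright-symbol` is a hypothetical name, and whether those keys are already set when a custom transformer runs depends on its position in the transformer chain, which this sketch does not verify.

```clojure
(require '[clojure.string :as string])

;; Hypothetical transformer: same [text state] -> [text state] shape as
;; capitalize above, but it skips lines flagged as code in the state map.
(defn copyright-symbol [text state]
  (if (or (:code state) (:codeblock state))
    [text state]
    [(string/replace text "(c)" "&copy;") state]))

;; Transformers are plain functions, so they can be tried without the parser:
(copyright-symbol "(c) 2015" {})                 ;=> ["&copy; 2015" {}]
(copyright-symbol "(c) 2015" {:codeblock true})  ;=> ["(c) 2015" {:codeblock true}]
```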
Alternatively, you could provide a custom set of transformers to replace the default transformers using the `:replacement-transformers` key.
(markdown/md-to-html-string "#foo" :replacement-transformers [capitalize])
This can also be used to add preprocessor transformers. For example, if we wanted to sanitize any image links we could do the following:
(require '[clojure.string :as string])
(use 'markdown.transformers 'markdown.core)
;; remove image links before the default transformers run
(defn escape-images [text state]
  [(string/replace text #"(!\[.*?\]\()(.+?)(\))" "") state])
(md-to-html-string
  "foo ![Alt text](/path/to/img.jpg \"Optional Title\") bar [text](http://test)"
  :replacement-transformers (cons escape-images transformer-vector))
"<p>foo bar <a href='http://test'>text</a></p>"
Another example would be to escape HTML tags:
(require '[clojure.string])
(require '[markdown.core :as md])
(require '[markdown.transformers :as mdtrans])
(defn escape-html [text state]
  (let [sanitized-text (clojure.string/escape text
                         {\& "&amp;"
                          \< "&lt;"
                          \> "&gt;"
                          \" "&quot;"
                          \' "&#39;"})]
[sanitized-text state]))
(def markdown-with-html
"## I am a title <h1></h1> with HTML tags !\n<script src=\"http://bad-url\">")
(md/md-to-html-string markdown-with-html
  :replacement-transformers
  (cons escape-html mdtrans/transformer-vector))
<h2>I am a title &lt;h1&gt;&lt;/h1&gt; with HTML tags !</h2>&lt;script src=&quot;http://bad-url&quot;&gt;
Usage ClojureScript
The ClojureScript portion works the same as above except that the entry function is called `md->html`. It accepts a string followed by the options as its input, and returns the resulting HTML string:
(ns myscript
(:require [markdown.core :refer [md->html]]))
(.log js/console
(md->html "##This is a heading\nwith a paragraph following it"))
(.log js/console
(md->html "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```"
:code-style #(str "class=\"" % "\"")))
Usage JavaScript
console.log(markdown.core.mdToHtml("##This is a heading\nwith a paragraph following it"));
Supported syntax
Control characters can be escaped using \
\\ backslash
\` backtick
\* asterisk
\_ underscore
\{ curly braces
\}
\[ square brackets
\]
\( parentheses
\)
\# hash mark
\+ plus sign
\- minus sign (hyphen)
\. dot
\! exclamation mark
Basic Elements
Blockquote, Strong, Bold, Emphasis, Italics, Heading, Line, Linebreak, Paragraph, Strikethrough
Links
Automatic Links
This is a shortcut style for creating “automatic” links for URLs and email addresses:
<http://example.com/>
will be turned into:
<a href="http://example.com/">http://example.com/</a>
Automatic links for email addresses work similarly, except that they are hex encoded:
<address@example.com>
will be turned into:
<a href="address@example.com">address@example.com</a>
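To illustrate what hex encoding means here, the sketch below turns every character of an address into a hexadecimal HTML character reference. `hex-encode` is a hypothetical helper written for this example (JVM Clojure only, since it relies on `format`); the parser's own encoding may differ in detail.

```clojure
;; Hypothetical helper, not taken from markdown-clj: encodes each
;; character as a hexadecimal HTML character reference.
(defn hex-encode [s]
  (apply str (map #(format "&#x%02x;" (int %)) s)))

(hex-encode "a@b") ;=> "&#x61;&#x40;&#x62;"
```

A browser renders such references as the original characters, while the raw address no longer appears verbatim in the page source.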
Lists
Code
Code Block, Indented Code, Inline Code
Heading
The number of hashes indicates the level of the heading:
# Heading
##Sub-heading
### Sub-sub-heading
Headings can also be defined using `=` and `-` for `h1` and `h2` respectively:
Heading 1
=========
Heading 2
---------
Line
***
* * *
*****
- - -
______
Linebreak
If a line ends with two or more spaces, a `<br />` tag will be inserted at the end.
Emphasis
*foo*
Italics
_foo_
Strong
**foo**
Bold
__foo__
Blockquote
`>` prefixes regular blockquote paragraphs. `>-` prefixes a blockquote footer that can be used for author attribution.
>This is a blockquote
with some content
>this is another blockquote
> Everyone thinks of changing the world,
but no one thinks of changing himself.
>- Leo Tolstoy
Paragraph
This is a paragraph, it's
split into separate lines.
This is another paragraph.
Unordered List
Indenting an item makes it a sublist of the item above it. Ordered and unordered lists can be nested within one another, and list items can be split over multiple lines.
* Foo
* Bar
* Baz
* foo
* bar
* baz
1. foo
2. bar
more content
## subheading
***
**strong text** in the list
* fuzz
* blah
* blue
* brass
Ordered List
1. Foo
2. Bar
3. Baz
Inline Code
Any special characters in code will be escaped with their corresponding HTML codes.
Here's some code `x + y = z` that's inlined.
Code block
Three backquotes indicate the start of a code block, and the next three backquotes end it. Optionally, the language name can be put after the opening backquotes to produce a tag compatible with highlight.js, e.g.:
```clojure
(defn foo [bar] "baz")
```
Indented Code
Indenting by at least 4 spaces creates a code block:
    some
    code
    here
note: XML is escaped in code sections
Strikethrough
~~foo~~
Superscript
a^2 + b^2 = c^2
Link
[github](http://github.com)
Reference Link
This is [an example][id] reference-style link.
[id]: http://example.com/ "Optional Title Here"
note: reference links require the `:reference-links?` option to be set to true
Footnote
Footnotes will appear automatically numbered with a link to the footnote at the bottom of the page [^footnote1].
[^footnote1]: The footnote will contain a back link to the referring text.
note: to enable footnotes, the `:footnotes?` option must be set to true.
Image
![Alt text](http://server/path/to/img.jpg)
![Alt text](/path/to/img.jpg "Optional Title")
Image Link
[![Continuous Integration status](https://secure.travis-ci.org/yogthos/markdown-clj.png)](http://travis-ci.org/yogthos/markdown-clj)
Limitations
The parser reads the content line by line, which means that tag content is not allowed to span multiple lines.
License
Copyright © 2015 Dmitri Sotnikov
Distributed under the Eclipse Public License, the same as Clojure.