Pandoc Reader: A Plugin for Pelican

Pandoc Reader is a Pelican plugin that converts documents written in Pandoc’s variant of Markdown into HTML.

Requirements

This plugin requires Pelican and a working installation of Pandoc.

By default, the plugin looks for a pandoc executable on your PATH. If you wish, you may specify an alternative location for your pandoc executable.

Installation

This plugin can be installed via:

python -m pip install pelican-pandoc-reader

Configuration

This plugin converts Pandoc’s variant of Markdown into HTML. Conversion from other Markdown variants is supported but requires the use of a Pandoc defaults file.

Converting to output formats other than HTML is not supported.

Specifying File Metadata

The plugin expects all Markdown files to start with a YAML-formatted content header, as shown below.

---
title: "<post-title>"
author: "<author-name>"
date: "<date>"
---

… or …

---
title: "<post-title>"
author: "<author-name>"
date: "<date>"
...

⚠️ Note: The YAML-formatted header shown above is syntax specific to Pandoc for specifying content metadata. This is different from Pelican’s front-matter format. If you ever decide to stop using this plugin and switch to Pelican’s default Markdown handling, you may need to switch your front-matter metadata to Python-Markdown’s Meta-Data format.
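As a quick way to check whether existing content already follows this convention, the sketch below tests that a document opens with a line of three hyphens and later closes the block with either three hyphens or three dots. It is only an illustration: the function name and the sample post are hypothetical, and this is not how the plugin itself parses metadata.

```python
def has_pandoc_yaml_header(text: str) -> bool:
    """Return True if text opens with a Pandoc-style YAML metadata block."""
    lines = text.splitlines()
    # The block must open with a line containing only three hyphens.
    if not lines or lines[0].rstrip() != "---":
        return False
    # It must close with a line of three hyphens or three dots.
    return any(line.rstrip() in ("---", "...") for line in lines[1:])


# Hypothetical sample post used only for illustration.
post = (
    "---\n"
    'title: "My Post"\n'
    'author: "Jane Doe"\n'
    'date: "2021-01-01"\n'
    "...\n"
    "\n"
    "Body text.\n"
)
print(has_pandoc_yaml_header(post))
```

A file that uses Pelican's default metadata format (for example, one starting with "Title: My Post") fails this check, which makes the sketch a simple way to spot files that still need converting.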

If you have files that use Pelican's front matter format, there is a script written by Joseph Reagle available that converts Pelican's front matter to Pandoc's YAML header format.

For more information, see the Pandoc documentation on YAML metadata blocks and the Pelican documentation on file metadata.

Specifying Pandoc Options

The plugin supports two mutually exclusive methods for passing options to Pandoc.

Method One: Via Pelican Settings

The first method involves configuring two settings in your Pelican settings file (e.g., pelicanconf.py):

  • PANDOC_ARGS
  • PANDOC_EXTENSIONS

In the PANDOC_ARGS setting, you may specify any arguments supported by Pandoc, as shown below:

PANDOC_ARGS = [
    "--mathjax",
    "--citeproc",
]

In the PANDOC_EXTENSIONS setting, you may enable/disable any number of the supported Pandoc extensions:

PANDOC_EXTENSIONS = [
    "+footnotes",   # Enabled extension
    "-pipe_tables", # Disabled extension
]
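To make the relationship between the two settings concrete, here is a minimal sketch of how they could be combined into a single pandoc command line. Pandoc accepts extensions appended to the input format name, so the extension list can be joined onto "markdown". The function name and the choice of html5 as output format are assumptions for illustration, and the command the plugin actually builds may differ.

```python
PANDOC_ARGS = ["--mathjax", "--citeproc"]
PANDOC_EXTENSIONS = ["+footnotes", "-pipe_tables"]


def build_pandoc_command(args, extensions, executable="pandoc"):
    """Assemble a pandoc command line from the two settings (illustrative only)."""
    # Pandoc accepts extensions appended directly to the input format name,
    # for example "markdown+footnotes-pipe_tables".
    input_format = "markdown" + "".join(extensions)
    # Writing to html5 is an assumption of this sketch.
    return [executable, "--from", input_format, "--to", "html5", *args]


command = build_pandoc_command(PANDOC_ARGS, PANDOC_EXTENSIONS)
print(" ".join(command))
```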

Method Two: Using Pandoc Defaults Files

The second method involves specifying the path(s) to one or more Pandoc defaults files, with all your preferences written in YAML format.

These paths should be set in your Pelican settings file by using the setting PANDOC_DEFAULTS_FILES. The paths may be absolute or relative, but relative paths are recommended as they are more portable.

PANDOC_DEFAULTS_FILES = [
    "<path/to/defaults/file_one.yaml>",
    "<path/to/defaults/file_two.yaml>",
]

Here is a minimal example of content that should be available in a Pandoc defaults file:

reader: markdown
writer: html5

Using defaults files has the added benefit of allowing you to use other Markdown variants supported by Pandoc, such as CommonMark and GitHub-Flavored Markdown.

See the Pandoc documentation on defaults files for a more complete example.
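On the command line, Pandoc reads a defaults file through its --defaults option, which may be given more than once. The sketch below shows how a list of defaults files could be turned into such a command. The file paths and the function name are hypothetical, and whether the plugin passes the files in exactly this way is an assumption.

```python
# Hypothetical paths, relative to the Pelican project directory.
PANDOC_DEFAULTS_FILES = [
    "defaults/base.yaml",
    "defaults/citations.yaml",
]


def build_defaults_command(defaults_files, executable="pandoc"):
    """Return a pandoc command that loads each defaults file in order."""
    command = [executable]
    for path in defaults_files:
        # --defaults may be repeated, once per file.
        command.append(f"--defaults={path}")
    return command


print(" ".join(build_defaults_command(PANDOC_DEFAULTS_FILES)))
```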

⚠️ Note: Neither method supports the --standalone or --self-contained arguments, which will yield an error if invoked.
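The two rules above (the methods are mutually exclusive, and --standalone and --self-contained are rejected) can be expressed as a small pre-flight check. This is a sketch under stated assumptions: the function name and the error messages are hypothetical, and it only inspects PANDOC_ARGS, not the contents of defaults files.

```python
UNSUPPORTED_ARGS = ("--standalone", "--self-contained")


def check_pandoc_settings(settings):
    """Raise ValueError for setting combinations described as invalid."""
    args = settings.get("PANDOC_ARGS", [])
    extensions = settings.get("PANDOC_EXTENSIONS", [])
    defaults_files = settings.get("PANDOC_DEFAULTS_FILES", [])

    # The two configuration methods are mutually exclusive.
    if defaults_files and (args or extensions):
        raise ValueError(
            "Use either PANDOC_DEFAULTS_FILES or "
            "PANDOC_ARGS/PANDOC_EXTENSIONS, not both."
        )

    # Neither method supports these arguments.
    for arg in args:
        if arg in UNSUPPORTED_ARGS:
            raise ValueError(f"Unsupported Pandoc argument: {arg}")


check_pandoc_settings({"PANDOC_ARGS": ["--mathjax"]})  # passes silently
```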

Generating a Table of Contents

If you want to create a table of contents (ToC) for posts or pages, you may do so by specifying the --toc or --table-of-contents argument in the PANDOC_ARGS setting, as shown below:

PANDOC_ARGS = [
    "--toc",
]

… or …

PANDOC_ARGS = [
    "--table-of-contents",
]

To add a ToC via a Pandoc defaults file, use the syntax below:

table-of-contents: true

The table of contents will be available for use in templates using the {{ article.toc }} or {{ page.toc }} Jinja template variables.

Enabling Citations

You may enable citations by specifying the -C or --citeproc option.

Set PANDOC_ARGS in your Pelican settings file as shown below:

PANDOC_ARGS = [
    "--citeproc",
]

… or …

PANDOC_ARGS = [
    "-C",
]

If you are using a Pandoc defaults file, you need the following as a bare minimum to enable citations:

reader: markdown
writer: html5

citeproc: true

Without these settings, citations will not be processed by the plugin.

It is not necessary to specify the +citations extension since it is enabled by default. However, if you disable citations by specifying -citations in PANDOC_EXTENSIONS or by setting reader: markdown-citations in your defaults file, citations will not work.

You may write your bibliography in any format supported by Pandoc, using the appropriate file extension. However, the bibliography file must have the same base name as your post.

For example, a post with the file name my-post.md should have a bibliography file called my-post.bib, my-post.json, my-post.yaml or my-post.bibtex in the same directory as your post, or in a subdirectory of the directory that your blog resides in. Failure to do so will prevent the references from being picked up.
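The naming rule can be illustrated with a short lookup that collects every file sharing the post's base name and carrying one of the four extensions listed above. This is a sketch, not the plugin's own search logic: the function name is hypothetical, and searching recursively below a single root directory is an assumption.

```python
from pathlib import Path

BIB_EXTENSIONS = (".bib", ".json", ".yaml", ".bibtex")


def find_bibliography_files(post_path, search_root=None):
    """Return bibliography files whose base name matches the post's."""
    post = Path(post_path)
    # Assumption: search the post's own directory (and below) by default.
    root = Path(search_root) if search_root else post.parent
    return sorted(
        path
        for path in root.rglob(post.stem + ".*")
        if path.stem == post.stem and path.suffix in BIB_EXTENSIONS
    )
```

For a post named my-post.md, this returns my-post.bib or my-post.yaml if they exist under the search root, and ignores files such as other.bib.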

Known Issues with Citations

If you enable citations with a specific style, you need to specify a CSL (Citation Style Language) file, available from the Zotero Style Repository. For example, if you are using the ieee-with-url style file, you may specify it in your Pelican settings file, as shown below:

PANDOC_ARGS = [
   "--csl=https://www.zotero.org/styles/ieee-with-url",
]

Or in a Pandoc defaults file:

csl: "https://www.zotero.org/styles/ieee-with-url"

Specifying a remote (that is, not local) CSL file as shown above dramatically increases the time taken to process Markdown content. To improve processing speed, it is highly recommended that you use a local copy of the CSL file downloaded from Zotero.

You may then reference it in your Pelican settings file as shown below:

PANDOC_ARGS = [
   "--csl=path/to/file/ieee-with-url.csl",
]

Or in a Pandoc defaults file:

csl: "path/to/file/ieee-with-url.csl"
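If you would rather not download the style by hand, one option is to fetch it once and reuse the local copy on later builds. The helper below is a minimal sketch using only the Python standard library. The function name and the "styles" directory are hypothetical, and nothing like it ships with the plugin.

```python
from pathlib import Path
from urllib.request import urlretrieve


def ensure_local_csl(url, destination):
    """Download the CSL file once and reuse the local copy afterwards."""
    destination = Path(destination)
    if not destination.exists():
        destination.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, destination)  # network access happens only here
    return destination


# Hypothetical usage in pelicanconf.py; the "styles" directory is an assumption.
# csl_path = ensure_local_csl(
#     "https://www.zotero.org/styles/ieee-with-url",
#     "styles/ieee-with-url.csl",
# )
# PANDOC_ARGS = [f"--csl={csl_path}"]
```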

Calculating and Displaying Reading Times

This plugin may be used to calculate the estimated reading time of articles and pages by setting CALCULATE_READING_TIME to True in your Pelican settings file:

CALCULATE_READING_TIME = True

You may display the estimated reading time using the {{ article.reading_time }} or {{ page.reading_time }} template variables. The unit of time will be displayed as “minute” for reading times less than or equal to one minute, or “minutes” for those greater than one minute.

The reading time is calculated by dividing the number of words by the reading speed, which is the average number of words read in a minute.

The default value for reading speed is set to 200 words per minute, but may be customized by setting READING_SPEED to the desired words per minute value in your Pelican settings file:

READING_SPEED = <words-per-minute>

The number of words in a document is calculated using the wordcount Lua Filter.
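The calculation described above can be sketched in a few lines. The word count is taken as given here, since the plugin obtains it from the Lua filter. The function name is hypothetical, and rounding up to a whole minute is an assumption of this sketch; the plugin's exact rounding may differ.

```python
import math


def estimate_reading_time(word_count, reading_speed=200):
    """Return a display string such as "1 minute" or "5 minutes"."""
    # Rounding up to a whole minute is an assumption of this sketch.
    minutes = max(1, math.ceil(word_count / reading_speed))
    unit = "minute" if minutes <= 1 else "minutes"
    return f"{minutes} {unit}"


print(estimate_reading_time(150))   # 150 words at 200 words per minute
print(estimate_reading_time(1000))  # 1000 words at 200 words per minute
```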

Customizing the Path for the pandoc Executable

If your pandoc executable does not reside on your PATH, set PANDOC_EXECUTABLE_PATH in your Pelican settings file to the absolute path of your pandoc executable, as shown below:

PANDOC_EXECUTABLE_PATH = "/path/to/my/pandoc"

This setting is useful when the pandoc executable supplied by your hosting provider is not recent enough and you need to install a version of Pandoc that is compatible with this plugin in a non-standard location.
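The lookup order described in this section (an explicit setting first, then PATH) can be sketched with the standard library's shutil.which. The function name and the sample path are hypothetical, and the plugin's own resolution logic may differ.

```python
import shutil


def resolve_pandoc_executable(settings):
    """Return the configured pandoc path, or fall back to a PATH lookup."""
    custom_path = settings.get("PANDOC_EXECUTABLE_PATH")
    if custom_path:
        return custom_path
    # shutil.which returns None when no pandoc is found on PATH.
    return shutil.which("pandoc")


# Hypothetical custom location, for illustration only.
print(resolve_pandoc_executable({"PANDOC_EXECUTABLE_PATH": "/opt/pandoc/bin/pandoc"}))
```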

Contributing

Contributions are welcome and much appreciated. Every little bit helps. You can contribute by improving the documentation, adding missing features, and fixing bugs. You can also help out by reviewing and commenting on existing issues.

To start contributing to this plugin, review the Contributing to Pelican documentation, beginning with the Contributing Code section.

Special thanks to Justin Mayer, Erwin Janssen, Joseph Reagle and Deniz Turgut for their improvements and feedback on this plugin.

License

This project is licensed under the AGPL-3.0 license.