/docker-pandoc

:whale: Docker container for Pandoc with XeLaTeX

Primary language: Shell. License: MIT.

Pandoc Docker container with XeLaTeX

This fork is maintained by Cal Evans <cal@calevans.com>.

License: MIT

The code unique to this fork is released under the MIT license. This license does not apply to any additional code used by the project. All programs used in the project are, to the best of my knowledge, open source, and they are covered by a variety of licenses. None of those licenses cover the books that the project generates; the books belong to their respective copyright holders. If you plan to distribute this project, I suggest you carefully review every program being installed and its license so that you understand them before you distribute.

What it does

This project takes a Markdown-formatted book project and creates:

  • HTML
  • PDF
  • EPUB
  • MOBI (Kindle)

Build

To build this container, I use the following in this repo's root.

$ docker build --force-rm --squash --tag buildbook ./

Requirements

For this project to work, several requirements must be met by the book.

Directories

The following directories are required to be present in the root directory of the book's repo.

  • manuscript
  • config
  • pandoc *
  • scripts

* The pandoc directory is optional. It holds a couple of specific pandoc templates. If the templates do not exist, the program will proceed without them. You can copy the pandoc directory included in this repo into the root level of your book repo and then modify the files as needed.
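
Before running the container, it can help to confirm the layout yourself. The following is a minimal sketch of such a check, not code taken from buildbook.sh. The function name check_book_dirs and its messages are made up for illustration.

```shell
# Hypothetical pre-flight check; NOT part of buildbook.sh.
# Usage: check_book_dirs BOOK_ROOT
check_book_dirs() {
  root="${1:-.}"
  status=0
  # Directories this README lists as required.
  for dir in manuscript config scripts; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing required directory: $root/$dir" >&2
      status=1
    fi
  done
  # pandoc/ is optional, so its absence is only reported.
  [ -d "$root/pandoc" ] || echo "note: no pandoc/ directory found" >&2
  return "$status"
}
```

Calling check_book_dirs . from the book's root returns non-zero if any required directory is missing.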

Hooks

The hooks directory can hold scripts that are called at specific points in the process. The programs can be written in any language (mine are usually PHP). Hooks must be executable on their own, and each must be named for the point at which you want it to run. Currently supported hooks, with the argument each one receives, are:

  • pre_process ROOTDIR
  • post_toc TOC_FILENAME
  • pre_kindle_toc TOC_FILENAME
  • post_kindle_toc TOC_FILENAME
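
As an illustration, here is a sketch that creates a do-nothing post_toc hook written in shell. It assumes the hook file is named exactly post_toc, lives in hooks/ at the book's root, and receives TOC_FILENAME as its first argument. The message it prints is made up, and how buildbook.sh reacts to a failing hook is not covered here.

```shell
# Create a hypothetical post_toc hook (assumed location: hooks/post_toc).
mkdir -p hooks
cat > hooks/post_toc <<'EOF'
#!/bin/sh
# Assumption: the first argument is TOC_FILENAME.
toc_file="$1"
if [ ! -f "$toc_file" ]; then
  echo "post_toc: $toc_file not found" >&2
  exit 1
fi
# Replace this line with whatever processing the TOC needs.
echo "post_toc: received $toc_file"
EOF
# Hooks must be executable on their own.
chmod +x hooks/post_toc
```

The same pattern should apply to the other hooks, with pre_process receiving ROOTDIR instead of a TOC file name.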

Files

The following files are required for the system to operate.

  • book.yaml All the info about the book that is necessary to create the various files but cannot be derived from the other files. This is divided into three sections: variables, book, and manuscript. Here is a sample:
---
variables:
  FINALNAMEROOT: life_badges_by_cal_evans
  VERSION: 1.0.2
  COVERGRAPHIC: cover.png
  TOCDEPTH: 2
book:
  title: Life Badges
  creator:
   - role: author
     text: Cal Evans
  publisher:  E.I.C.C., Inc.
  rights: © 2017 E.I.C.C., Inc. All Rights Reserved
  language: en_US.UTF-8
manuscript:
- foreword.md
- intro.md
- learn.md
- humility.md
- be_a_better_person.md
- be_a_better_developer.md
- create_something.md
- help_someone.md
- greatness.md
- leadership.md
- inspire.md
- conclusion.md
- 99.md
...

The variables section is a simple list of key/value pairs. It is dumped as KEY=VALUE pairs into the file /tmp/book.txt, which the script then sources. From that point on, the variables are available to the running script.
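
The dump-and-source step can be pictured roughly as follows. This is a sketch of the general technique only, not the code in buildbook.sh: it writes the sample values to a temporary file rather than /tmp/book.txt and does not show how the YAML is converted.

```shell
# Sketch: write KEY=VALUE pairs to a file, then source it.
vars_file="$(mktemp)"
cat > "$vars_file" <<'EOF'
FINALNAMEROOT=life_badges_by_cal_evans
VERSION=1.0.2
COVERGRAPHIC=cover.png
TOCDEPTH=2
EOF

# Sourcing the file turns each line into a shell variable.
. "$vars_file"
echo "Building ${FINALNAMEROOT} version ${VERSION}"
rm -f "$vars_file"
```

Because the file is sourced as shell code, a value containing spaces or shell metacharacters would presumably need quoting.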

The book section is dumped into book.yaml and is used to create the EPUB file. The manuscript section lists the files in the order they should appear, using the file names as they appear in the manuscript/ directory.
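
Since every entry in the manuscript list has to match a file in manuscript/, a quick check can catch typos before a build. The helper below is a hypothetical sketch, not part of buildbook.sh. It takes the file names as arguments rather than parsing book.yaml.

```shell
# Hypothetical helper; NOT part of buildbook.sh.
# Usage: check_chapters MANUSCRIPT_DIR FILE...
check_chapters() {
  dir="$1"
  shift
  status=0
  for chapter in "$@"; do
    if [ ! -f "$dir/$chapter" ]; then
      echo "missing: $dir/$chapter" >&2
      status=1
    fi
  done
  # Non-zero when at least one listed file does not exist.
  return "$status"
}
```

For example, check_chapters manuscript foreword.md intro.md learn.md reports any of the three files that are missing.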

  • manuscript/title.md This is the title page of the book. It can be Markdown and can include HTML.

  • pandoc/toc.html OPTIONAL If you are familiar with pandoc and you want to control the text surrounding the table of contents that is generated for the PDF, you can place a file in the pandoc directory named toc.html. It is up to you to understand what can go in this file. Here is a sample:

<div class="pagebreak">
<h1>Table of Contents</h1>
$toc$
</div>

  • pandoc/copyright.html OPTIONAL This is not a pandoc-specific file; it is just the file that will be used as a template for your copyright page if it exists. If it exists, we will try to replace <!--DATEPUBLISHED--> and <!--VERSION--> wherever they appear. Use this file to put any boilerplate you want on the copyright page.

  • pandoc/template.opf OPTIONAL If you are creating a Kindle book, you need to build the opf. Otherwise, you can ignore this.
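
To preview how pandoc/copyright.html will look once its placeholders are filled in, a substitution along these lines can be used. This is a sketch using sed, not necessarily how buildbook.sh performs the replacement. The function name fill_copyright and the date format in the usage note are assumptions.

```shell
# Hypothetical helper; NOT part of buildbook.sh.
# Usage: fill_copyright TEMPLATE VERSION DATEPUBLISHED
# Prints the filled-in template on stdout.
fill_copyright() {
  template="$1"
  version="$2"
  published="$3"
  # "|" is the sed delimiter, so the values must not contain "|" or "&".
  sed -e 's|<!--VERSION-->|'"$version"'|g' \
      -e 's|<!--DATEPUBLISHED-->|'"$published"'|g' "$template"
}
```

For example, fill_copyright pandoc/copyright.html 1.0.2 2017-06-01 prints the page with both placeholders replaced.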

Process

When the container is properly executed, it will run ./buildbook.sh. This is the starting point. ./buildbook.sh will check to make sure that all necessary files are present. It will then generate the output in all supported formats.

Linux/macOS

$ docker run --rm -v "$PWD":/data buildbook

Windows

docker run --rm -v "%cd%":/data buildbook

Version History

  • 1.0.0 First Release
  • 1.1.0 Now supports the latest pandoc as well as TOCDEPTH
  • 1.2.4 Various fixes and such. Now the cover graphic thing should work. Also, ANYTHING we are modifying (toc, cover, copyright) is now looked for in the PANDOC directory.
  • 1.3.0 Now with Kindle processing
  • 1.4.0 HOOKS! WE HAVE HOOKS!
  • 1.4.3 Moved the tmp dir into the main dir, but we create it at the start and remove it at the end. Also, new hooks.

Notes

  • As of 1.4.0, the EPUB and PDF TOC still do not generate correctly if you are using images for chapter headers. Not sure what to do about it.

TODO

  • In pandoc/copyright.html, parameterize the publisher name.

DEBUGGING

If you need to debug the script, build the Docker container, then use this command from the book dir. The last -v should point to your development copy of buildbook.sh.

docker run --rm -v "$PWD":/data -v ~/Projects/docker-pandoc/buildbook.sh:/usr/local/bin/buildbook.sh buildbook

If you need to execute something inside the container, use this. NOTE: You are not inside a shared, already-running container. You are launching a self-contained environment isolated from everything else. Even if you use the run command above to run the code, it is not running in this container.

docker run -it --entrypoint bash -v "$PWD":/data -v ~/Projects/docker-pandoc/buildbook.sh:/usr/local/bin/buildbook.sh buildbook