About
We build books in a pipeline of steps. These steps are written in different languages and need to run on a server as well as locally for development.
Table of Contents
- About
- Table of Contents
- Local Instructions
- Steps
- Environment Variables
- Features
- Development
- CI Integration
- Running the Enki command in Gitpod
- Create a Webhosting Pipeline
- TODO list
Local Instructions
This uses a little wrapper to hide all the docker commands.
Note: See docs-pipeline.md to know what each of the steps do and which files they expect/generate.
Clone & Install Submodules
- Clone this repository
- Ensure submodules are checked out by running
./install-submodules.sh
- (Optional) Copy .env.example, fill in the values, and source the environment.
- Run the tests to verify everything works:
./test.sh
Then try the following to build PDFs and other formats:
# All-in-one Git-based books
# CLI command repo_name book_slug (optional)
./enki --command all-pdf --repo 'philschatz/tiny-book' --book-slug 'book-slug1' --ref main
./enki --command all-web --repo 'philschatz/tiny-book' --ref main
./enki --command all-epub --repo 'openstax/osbooks-introduction-anthropology' --book-slug 'introduction-anthropology' --ref main
# GH_SECRET_CREDS='..' before running enki for private repositories
Note that if book_slug is not provided, all books in the repo will be built by default.
REX preview
For a REX preview, build an all-web book and then run the rex-preview command.
Important Note: Turn off any ad blocker for https://rex-web.herokuapp.com, otherwise the preview will not work. Don't use Safari, because its built-in tracker blocking also breaks REX preview.
Example using a custom "rextest" data directory:
./enki --command all-web --repo 'philschatz/tiny-book' --data-dir ./data/rextest/ --book-slug 'book-slug1' --ref main
# Start up a local REX Preview (after running all-web)
./enki --command rex-preview --data-dir ./data/rextest/
For every collection built, the rex-preview command first opens an error page, which can be opened in the browser. To reach the real modules, use the navigation on the left side.
Ctrl+C stops rex-preview.
A note on caching: if your browser does not display local HTML changes properly, it may be serving a cached copy of the original version. To prevent this, 1) configure the browser to disable caching while developer tools are open, and 2) keep developer tools open.
Disable for Firefox: https://stackoverflow.com/a/39869243/756056
Disable for Chrome: https://stackoverflow.com/a/7000899
Local preview of HTML files
For a local preview of HTML files with working image resources, build an all-web book and then run the local-preview command.
Example using a custom "localptest" data directory:
./enki --command all-web --repo 'philschatz/tiny-book' --data-dir ./data/localptest/ --book-slug 'book-slug1' --ref main
# Start up a local preview (after running all-web)
./enki --command local-preview --data-dir ./data/localptest/
The local-preview command tells you that the HTML files can be opened from the directory ./data/localptest/contents. Simply cd into this directory and open the HTML files in a browser.
Private Repositories
To clone private repositories there are 2 options:
Set a GitHub token
This is how production works but it can be annoying to clone a whole book all the time.
You will need to set a GH_SECRET_CREDS environment variable. To do that, create a token at https://github.com/settings/tokens and ensure the repo scope is selected. Then, set GH_SECRET_CREDS={the_token}:x-oauth-basic in the .env file in this directory and run the CLI.
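As a minimal sketch of the format described above, the line for the .env file can be assembled from a token like this. The token value is a made-up placeholder, and the line is written to a scratch file rather than the real .env so the example has no side effects.

```shell
# Hypothetical placeholder; substitute a real token created at
# https://github.com/settings/tokens with the repo scope selected.
the_token='example-token-value'

# Written to a scratch file here; in practice this line goes in the .env file.
env_file="$(mktemp)"
printf 'GH_SECRET_CREDS=%s:x-oauth-basic\n' "$the_token" > "$env_file"

# Show the resulting line.
cat "$env_file"
```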
Sideload the book
If you already have the files locally, you can specify a path to them and the CLI can sideload them in instead of fetching from GitHub.
To use this method, use the --sideload argument to specify where to load the book from.
Google Docs
To upload DOCX files to Google Docs follow the instructions here. To build a Google Docs pipeline, run npm run build:gdocs in ./build-concourse/
Steps
Here is an autogenerated graph of all the steps to build a PDF using all-pdf. For more see the graphs here:
graph TB
step-fetch -- fetched --> step-prebake
step-fetch -- fetched --> step-bake
step-prebake -- assembled --> step-bake
step-prebake -- resources --> step-bake
step-fetch -- fetched --> step-postbake
step-prebake -- assemble-meta --> step-postbake
step-bake -- baked --> step-postbake
step-postbake -- linked --> step-pdf
step-bake -- baked --> step-pdf
step-prebake -- fetch-meta --> step-pdf
step-prebake -- resources --> step-pdf
step-fetch -- fetched --> step-upload-pdf
step-pdf -- artifacts --> step-upload-pdf
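One way to read the graph above is as a list of "producer consumer" pairs, from which a valid run order can be derived. The sketch below uses the standard tsort utility for that. It only illustrates the dependency ordering and is not how the pipeline itself schedules steps. The edge labels (fetched, baked, and so on) are dropped and duplicate edges are listed once.

```shell
# Edges copied from the all-pdf graph, one "producer consumer" pair per line.
edges='step-fetch step-prebake
step-fetch step-bake
step-prebake step-bake
step-fetch step-postbake
step-prebake step-postbake
step-bake step-postbake
step-postbake step-pdf
step-bake step-pdf
step-prebake step-pdf
step-fetch step-upload-pdf
step-pdf step-upload-pdf'

# tsort prints the nodes so that every producer precedes its consumers.
order="$(printf '%s\n' "$edges" | tsort | tr '\n' ' ')"
order="${order% }"   # drop the trailing space left by tr
echo "$order"
```

Because each step in this graph depends on the one before it, there is only one valid order: fetch, prebake, bake, postbake, pdf, upload-pdf.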
Environment Variables
This project uses environment variables extensively to set things like:
- Trace Logging
- Mounting a custom cookbook/ directory
- Skipping certain steps
- AWS authentication credentials
- Google authentication credentials
See ./.env.example for all the environment variables and examples.
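A common way to source an environment file so that its variables reach child processes is the shell's allexport option. The sketch below uses a scratch file in place of a filled-in copy of .env.example, and EXAMPLE_TRACE_FLAG is a made-up name, not one of the project's variables.

```shell
# Scratch file standing in for a filled-in copy of .env.example.
# EXAMPLE_TRACE_FLAG is a hypothetical variable used only for illustration.
env_file="$(mktemp)"
printf 'EXAMPLE_TRACE_FLAG=1\n' > "$env_file"

set -a          # export every variable assigned while this option is on
. "$env_file"   # read the KEY=value lines into the current shell
set +a

# A child process now sees the variable.
sh -c 'echo "EXAMPLE_TRACE_FLAG=$EXAMPLE_TRACE_FLAG"'
```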
Artifact and Queue Buckets
The book pipeline uses S3 buckets for two purposes: uploading artifacts and keeping track of which books have been built (webhosting pipeline).
Name | Use | Description |
---|---|---|
WEB_QUEUE_STATE_S3_BUCKET | Queue | When a book has completed building, the file in this bucket is updated |
S3_QUEUE | Queue | Webhosting and GDocs pipeline stores a queue of books that need to be built. It is one file with multiple versions |
CORGI_ARTIFACTS_S3_BUCKET | State | CORGI job artifacts (PDF, Webhosting JSON files, GDocs DOCX files) are uploaded here |
WEB_S3_BUCKET | State | Webhosting JSON files are uploaded to this bucket |
Features
- Dependency Graph explicitly in code (concourse steps, local steps, and graphical documentation are all kept in sync)
- Scripts for the build steps allow using a Dockerfile as well as rebuilding if code is updated inside the container (e.g. a developer running the container in Gitpod or VS Code)
- Integration tests and code coverage
In order to support both use cases (running on a server and running locally), all the steps are included in one Docker container and parameters are specified as environment variables.
Input/output directories for each step are specified as environment variables because local development does not need different directories, but the production builds in Concourse CI use different input/output directories for each step.
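The idea of pointing one step at different directories through the environment can be sketched as follows. The step, the file names, and the variables EXAMPLE_INPUT_DIR and EXAMPLE_OUTPUT_DIR are all hypothetical; the real names are the ones defined in the project's step scripts.

```shell
# Hypothetical step: upper-cases book.txt from the input directory into the
# output directory. The variable names are illustrative, not the project's.
run_example_step() {
  tr 'a-z' 'A-Z' < "$EXAMPLE_INPUT_DIR/book.txt" > "$EXAMPLE_OUTPUT_DIR/book.out.txt"
}

# Local development: one shared data directory is both input and output.
shared="$(mktemp -d)"
echo 'hello' > "$shared/book.txt"
EXAMPLE_INPUT_DIR="$shared"
EXAMPLE_OUTPUT_DIR="$shared"
run_example_step

# CI-style run: the same step with separate input and output directories.
ci_in="$(mktemp -d)"
ci_out="$(mktemp -d)"
echo 'hello' > "$ci_in/book.txt"
EXAMPLE_INPUT_DIR="$ci_in"
EXAMPLE_OUTPUT_DIR="$ci_out"
run_example_step

cat "$shared/book.out.txt" "$ci_out/book.out.txt"
```

The step body is identical in both runs; only the environment changes.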
The code is organized as follows:
- Dockerfile contains a multi-stage build and builds all the code necessary to build a PDF, webhosting JSON, or DOCX files ready to upload to Google Docs
- dockerfiles/build/ contains scripts that build the code in the container
- dockerfiles/steps/ contains the code for each step (e.g. fetch, assemble, bake, mathify) as well as convenience all-* steps which are only for local development
- enki is the starting point for developers building books locally on their machine
- build-concourse/ contains scripts (npm start) that generate Concourse Pipeline YAML files for the different CORGI environments (production, staging, local) and different webhosting environments (production, sandbox, local)
- step-definitions.ts The dependency graph for all builds
- step-config.json Autogenerated file so the dependency graph can be used inside the container in shell scripts
- graphs/ Autogenerated graphs showing each step in the pipelines
Development
In order to debug steps inside the container you can start a shell instead of running a step. To run a shell inside the container run the following:
./enki --command shell
Then, to run the step inside the container, run docker-entrypoint.sh {step_name}.
Run Tests
- Run ./test.sh
- Open ./coverage/index.html in a browser to see coverage
Run the pipeline starting at a specific step
Often, developers would like to rerun only part of the pipeline (e.g. bake) to skip expensive steps like fetching the book contents.
Use the START_AT_STEP= environment variable. Example:
START_AT_STEP=git-bake ./enki ./data/tin-bk all-pdf
Note: The arguments following all-pdf can be omitted since they are only used in the initial step
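The skipping behaviour can be pictured with a small loop. This is an illustration of the idea only, not the enki implementation, and the step names are placeholders.

```shell
# Illustration only: placeholder step names, not the real pipeline steps.
steps='step-a step-b step-c step-d'
START_AT_STEP='step-c'

ran=''
started=''
for step in $steps; do
  # Skip every step that comes before the requested starting step.
  if [ -z "$started" ] && [ -n "$START_AT_STEP" ] && [ "$step" != "$START_AT_STEP" ]; then
    echo "skip $step"
    continue
  fi
  started=1
  echo "run  $step"
  ran="${ran:+$ran }$step"
done
```

With START_AT_STEP left empty, the first iteration sets started and every step runs.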
Debugging
Python Debugging
In order to debug a Python script from the ./bakery-src/scripts folder, add the --pdb flag to the ./enki command. This opens port 5678 on the container and allows you to connect to it with a debugger.
In addition, add the following snippet at the top of the Python script you want to debug and set breakpoints below the debug() line:
from .profiler import debug
debug()
Java Debugging
In order to debug the Java code, set breakpoints in the Java code and then run the ./enki command with the --jdb flag. The Java program will then suspend until a client connects to it and stop at the breakpoints, so you can step through the code.
JavaScript Debugging
In order to debug the JavaScript code, set breakpoints in the JavaScript code under bakery-js/src and then run the ./enki command with the --jsdb flag. The Node program will then suspend until a client connects to it and stop at the breakpoints, so you can step through the code.
CI Integration
This repo can be used as the image in a Gitpod environment. All of the code to build the various dependencies (like cookbook) is in bash scripts inside the container, so it can be run from within the container (see Dockerfile).
Running the Enki command in Gitpod
- Open the Enki Repository in Gitpod
- Follow Local Instructions to build a book
  - NOTE: You will want to set a --data-dir that is relative to your working directory (like ./data) so that artifacts are easy to download/preview
Gitpod Differences
- Local preview can be utilized via the Live Server extension that should be preinstalled in the Gitpod workspace
  - Starting Live Server
    - Open the command palette and run "Open with Live Server"
    - OR right-click an HTML document and click "Open with Live Server"
    - OR CMD + L followed by CMD + O
  - When you start Live Server, a new tab that shows a directory listing should open in your browser
    - If a tab does not open automatically
      - Open your terminal (Control + ` OR Create New Terminal in command palette)
      - If you do not see a Ports tab above your terminal, right-click the bar and check the Ports option
      - Click Ports. You should see that port 5500 is open. If not, make sure that Live Server is running.
      - Click the address or the Globe icon. That should open the aforementioned directory listing
  - Navigate to your local preview.
    - In my case, I ran ./enki --command local-preview --data-dir ./data/localptest, so my local preview was in data/localptest/local-preview/contents
- Build artifacts need to be downloaded
You can download build artifacts like PDFs and zips by locating them in the explorer panel, right-clicking them, and clicking Download...
Submodule Development
- Submodules like nebuchadnezzar, cookbook, etc. can be used like any other git repository. You can checkout branches, modify files, pull or push changes, and anything else you might do with a git repository.
- When a submodule is changed, the Enki image should automatically rebuild the layers that were affected by the change.
Create a Webhosting Pipeline
Find out which codeversion to use (<codeversion>). Then run the following in the ./build-concourse/ directory:
# cd ./build-concourse
npm install
# Build the concourse pipeline YAML file
CODE_VERSION=<codeversion> npm run build
# Upload
fly --target=prod set-pipeline --pipeline=webhost-prod-<codeversion> --config=./webhosting-production.yml
# Unpause the pipeline (via the UI or via the command that prints out)
TODO list
- lint TypeScript files
- get vscode to recognize the filesystem imports in the TypeScript files
- remove the git- prefix from tasks so they will just work when we remove archive tasks
- remove virtualenv and install python packages to the system (unless it's bad practice)
Future TODO work
- shellcheck entrypoint bash scripts
- Read book list from META-INF/books.xml instead of ls *.collection.xml using xmlstarlet
- Consistent if;then, quotes (or not) around variables, and curly braces around variables
- move pm2 into bakery-scripts/ instead of being installed globally in the Dockerfile
- move auth secret rotation into this repo. See openstax/corgi#355