This system compiles family journal entries into organized books in HTML, TXT, and PDF formats. The entries are categorized and compiled into individual category files and decade-based books.
- posts/ - Contains individual journal entries as .txt files, each with a header date line
- monthly/ - Monthly compilation files (YYYY-MM.txt format)
- web/ - Web interface for browsing and searching journal entries
- Individual category files: AHNS.html, AHNS.pdf, AHNS.txt, J.html, J.pdf, J.txt, US.html, US.pdf, US.txt
- Decade books: book-2013-2019.pdf (US + AHNS), book-2020-YYYY.pdf (US + J, where YYYY is current year)
- Combined book: book.pdf (all categories)
Posts are categorized by filename patterns:
- Pattern: Files containing "AHNS" in the filename
- Example: 2015-09-08-AHNS-2015-09-08.txt
- Date range: 2013-2019 (no new posts added)
- Appears in: AHNS.pdf, book-2013-2019.pdf, book.pdf
- Pattern: Files containing "J" in the filename
- Example: 2024-01-20-J-2024-01-20.txt
- Date range: 2020-present
- Appears in: J.pdf, book-2020-YYYY.pdf, book.pdf
- Pattern: Files containing "-A-" or "-D-" in the filename
- Example: 2023-12-12-A-2024-01-16.txt
- Date range: 2013-present
- Appears in: US.pdf, decade books based on date, book.pdf
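The filename rules above can be sketched in a few lines of Python. This is only an illustration, not the build script's own logic: the function name `categorize` is hypothetical, and the order of the checks (AHNS first, then -A-/-D-, then J) is an assumption made so that a plain "J" substring test cannot shadow the other two categories.

```python
from typing import Optional


def categorize(filename: str) -> Optional[str]:
    """Map a post filename to "AHNS", "US", or "J"; None if no pattern matches."""
    # Check order is an assumption, see the note above.
    if "AHNS" in filename:
        return "AHNS"
    if "-A-" in filename or "-D-" in filename:
        return "US"
    if "J" in filename:
        return "J"
    return None


if __name__ == "__main__":
    for name in (
        "2015-09-08-AHNS-2015-09-08.txt",
        "2024-01-20-J-2024-01-20.txt",
        "2023-12-12-A-2024-01-16.txt",
    ):
        print(name, "->", categorize(name))
```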
Primary compilation script that generates all books.
Process:
- Create build directory for intermediate files
- Generate text files and temporary covers in build directory
- Process files through markdown→HTML→PDF pipeline
- Assemble final books with pdftk
- Generate monthly compilations
- Move final files to main directory and clean up build directory
Final Output Files:
- Decade books: book-2013-2019.pdf (US + AHNS), book-2020-YYYY.pdf (US + J, where YYYY is current year)
- Individual category files: AHNS.{html,pdf,txt}, J.{html,pdf,txt}, US.{html,pdf,txt}
- Combined book: book.pdf (all categories with covers)
Generates monthly compilation files in the monthly/ directory. Only processes files with -A- or -D- patterns (US category files). Uses single-pass file processing with associative arrays to group files by month.
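The grouping step is done with bash associative arrays in the build; the Python sketch below shows the same single-pass idea with a dictionary. The function name `group_by_month` is hypothetical, and taking the month from the leading YYYY-MM of the filename (rather than the trailing date) is an assumption.

```python
import os
from collections import defaultdict


def group_by_month(paths):
    """Group US-category posts (-A- / -D-) by the YYYY-MM prefix of their filename."""
    months = defaultdict(list)
    for path in sorted(paths):  # sorted input keeps each month chronological
        name = os.path.basename(path)
        if "-A-" in name or "-D-" in name:
            months[name[:7]].append(name)  # "2023-12-12-..." -> "2023-12"
    return dict(months)


if __name__ == "__main__":
    posts = [
        "posts/2023-12-12-A-2024-01-16.txt",
        "posts/2023-12-03-D-2023-12-04.txt",
        "posts/2024-01-20-J-2024-01-20.txt",  # J post, skipped
    ]
    for month, names in group_by_month(posts).items():
        print(f"monthly/{month}.txt <-", names)
```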
Each category goes through this pipeline via process_file_type():
- Decode: python3 -m quopri -d - Decode quoted-printable encoding
- Convert: pandoc -f markdown -t html - Markdown to HTML
- Format: sed commands - Format for table layout
- Style: Combine with pandoc.css and process with dow.py (determines day of week)
- PDF: generate_content_pdf.py - Create PDF (8"×10")
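Two of these steps are easy to try in isolation with the standard library. The sketch below decodes quoted-printable text (what `python3 -m quopri -d` does on the command line) and derives a weekday name from an ISO date. It is not the code in dow.py; the helper names are hypothetical, and how dow.py actually locates the date is not shown here.

```python
import quopri
from datetime import date

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def decode_quoted_printable(raw: bytes) -> str:
    """Decode quoted-printable bytes and interpret the result as UTF-8."""
    return quopri.decodestring(raw).decode("utf-8")


def day_of_week(iso_date: str) -> str:
    """Return the weekday name for a YYYY-MM-DD date string."""
    return DAY_NAMES[date.fromisoformat(iso_date).weekday()]


if __name__ == "__main__":
    print(decode_quoted_printable(b"caf=C3=A9"))
    print(day_of_week("2024-01-20"))
```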
Covers are generated temporarily during the build process and cleaned up automatically:
Cover Types:
- Generic cover (title: "Outer Dibblestan") - for combined book main section
- AHNS section divider with date range (e.g., "2013 - 2019") - for AHNS sections
- Uncle J section divider with date range (e.g., "2020 - 2025") - for J sections
- Decade-specific main covers with date ranges - for decade books
Cover format: Large title with smaller date range below on separate line.
Cover generation: generate_cover.py produces the covers at build time; they are removed along with the rest of the build directory.
Posts follow the pattern: YYYY-MM-DD-[description]-[category]-YYYY-MM-DD.txt
Examples:
- 2023-12-12-A-2024-01-16.txt (US category)
- 2024-01-20-J-2024-01-20.txt (J category)
- 2015-09-08-AHNS-2015-09-08.txt (AHNS category)
- python3 - For quopri decoding and dow.py processing
- pandoc - Markdown to HTML conversion
- sed - Text processing
- sponge - From moreutils, for in-place file editing
- pdftk - PDF concatenation
- awk - File aggregation
- uv - Python package manager (for WeasyPrint PDF generation)
./make_omnibus
Creates:
- Decade books: book-2013-2019.pdf (US + AHNS), book-2020-YYYY.pdf (US + J, where YYYY is current year)
- Individual category files: AHNS.{html,pdf,txt}, J.{html,pdf,txt}, US.{html,pdf,txt}
- Combined book: book.pdf (all categories with section covers)
- Monthly compilations: monthly/YYYY-MM.txt files (-A- and -D- files only)
./make_clean
Removes all generated files and directories.
- Build system: Uses temporary build/ directory for intermediate files, cleaned up automatically
- Error handling: Uses bash settings (set -euo pipefail) and extended globbing (shopt -s extglob)
- Parallel processing: Most operations run in parallel; file processing avoids command line length limits
- File processing: Single-pass file discovery with associative arrays to group files
- Year extraction: Uses bash parameter expansion to extract years from filename format posts/YYYY-MM-DD-...
- Chronological order: Maintained through YYYY-MM-DD filename prefixes and sorted processing
- PDF generation: Uses WeasyPrint for covers and content (8"×10" page dimensions)
- Decade handling: Adapts to current year for future decades without hardcoding
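Year extraction and decade handling can be illustrated as follows. The build does this with bash parameter expansion; this Python sketch only mirrors the idea. The function names and the rule used for decade boundaries (calendar decades, clipped to the first post year and the current year) are assumptions chosen to reproduce the book-2013-2019.pdf and book-2020-YYYY.pdf names.

```python
import os


def post_year(path: str) -> int:
    """Extract the year from a path like posts/YYYY-MM-DD-....txt."""
    return int(os.path.basename(path)[:4])


def decade_book_name(year: int, first_year: int, current_year: int) -> str:
    """Name the decade book that a post from `year` belongs to."""
    decade_start = year - year % 10
    start = max(first_year, decade_start)       # first decade starts at the first post
    end = min(current_year, decade_start + 9)   # open decade ends at the current year
    return f"book-{start}-{end}.pdf"


if __name__ == "__main__":
    year = post_year("posts/2023-12-12-A-2024-01-16.txt")
    print(year, decade_book_name(year, first_year=2013, current_year=2025))
```

Because the end year is computed rather than hardcoded, a post from 2031 would land in a new book-2030-... file under the same rule.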
The web/ directory contains a Flask-based web interface for browsing and searching journal entries.
- app.py - Flask web server with search, filtering, and post viewing APIs
- templates/index.html - Single-page web application frontend
- static/ - CSS and JavaScript for the web interface
- zoolog.db - SQLite database with posts and full-text search index
- Timeline visualization - Monthly post counts by category
- Full-text search - Search across all post content with highlighting
- Category filtering - Filter by AHNS, J, or US categories
- Date range filtering - View posts within specific date ranges
- Post navigation - Browse between posts with prev/next within search context
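As a rough idea of how the timeline counts could be computed from SQLite, the sketch below groups posts by month and category. The `posts` table and its `date` and `category` columns are hypothetical; the real schema of zoolog.db is not described here, and app.py may compute this differently.

```python
import sqlite3


def build_demo_db(rows):
    """Create an in-memory database with a hypothetical posts(date, category) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (date TEXT, category TEXT)")
    conn.executemany("INSERT INTO posts VALUES (?, ?)", rows)
    return conn


def monthly_counts(conn):
    """Return (YYYY-MM, category, count) tuples ordered by month, then category."""
    query = """
        SELECT substr(date, 1, 7) AS month, category, COUNT(*)
        FROM posts
        GROUP BY month, category
        ORDER BY month, category
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    demo = build_demo_db([
        ("2015-09-08", "AHNS"),
        ("2024-01-05", "US"),
        ("2024-01-16", "US"),
        ("2024-01-20", "J"),
    ])
    for row in monthly_counts(demo):
        print(row)
```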
cd web
./app.py
The web interface will be available at http://localhost:8000