/toc-utils

Python scripts that add a table of contents (TOC) to PDF files of scanned books.

Primary language: Python

MuPDF library

Documentation https://pymupdf.readthedocs.io/en/latest/

Discussion about page labels pymupdf/PyMuPDF#782

Page Labels in docs https://pymupdf.readthedocs.io/en/latest/document.html?highlight=labels#Document.set_page_labels
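As a rough illustration of how `Document.set_page_labels` might be used for a scanned book, the sketch below labels the front matter with lowercase roman numerals and the remaining pages with decimals starting at 1. The helper names `make_label_rules` and `apply_labels` and the file paths are hypothetical and not part of this repository; the dictionary keys follow the PyMuPDF documentation linked above.

```python
def make_label_rules(front_matter_pages):
    """Build label rules: roman numerals for front matter, decimals after it.

    `front_matter_pages` is the number of PDF pages before printed page 1.
    """
    rules = []
    if front_matter_pages > 0:
        # Pages 0 .. front_matter_pages-1 are labeled i, ii, iii, ...
        rules.append(
            {"startpage": 0, "prefix": "", "style": "r", "firstpagenum": 1}
        )
    # The main text restarts at 1 with decimal numbering.
    rules.append(
        {"startpage": front_matter_pages, "prefix": "", "style": "D", "firstpagenum": 1}
    )
    return rules


def apply_labels(src, dst, rules):
    """Write the label rules into a copy of the PDF (hypothetical helper)."""
    import fitz  # PyMuPDF, imported lazily so the helper above has no dependencies

    doc = fitz.open(src)
    doc.set_page_labels(rules)
    doc.save(dst)
    doc.close()
```

For example, `apply_labels("book.pdf", "book-labeled.pdf", make_label_rules(8))` would treat the first eight PDF pages as front matter.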

Set TOC in docs https://pymupdf.readthedocs.io/en/latest/document.html?highlight=TOC#Document.set_toc

Get TOC in docs https://pymupdf.readthedocs.io/en/latest/document.html?highlight=TOC#Document.get_toc
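A minimal sketch of writing and reading back a TOC with `Document.set_toc` and `Document.get_toc`, assuming the outline is known as (level, title, printed page) tuples and that printed page numbers differ from PDF page numbers by a constant offset, as is common in scanned books. The helpers `shift_toc` and `write_toc` are hypothetical names and not necessarily how the scripts in this repository work.

```python
def shift_toc(entries, offset):
    """Convert (level, title, printed_page) tuples to [level, title, pdf_page].

    PyMuPDF expects 1-based page numbers, a first entry at level 1,
    and levels that never increase by more than one step at a time.
    """
    toc = []
    prev_level = 0
    for level, title, printed_page in entries:
        if level < 1 or level > prev_level + 1:
            raise ValueError(f"invalid level {level} for entry {title!r}")
        toc.append([level, title, printed_page + offset])
        prev_level = level
    return toc


def write_toc(src, dst, toc):
    """Store the TOC in a copy of the PDF and return it as read back."""
    import fitz  # PyMuPDF, imported lazily so shift_toc stays dependency-free

    doc = fitz.open(src)
    doc.set_toc(toc)
    written = doc.get_toc()  # list of [level, title, page] entries
    doc.save(dst)
    doc.close()
    return written
```

Validating the levels before calling `set_toc` gives a clearer error message that names the offending entry.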

OCR PDF page https://pymupdf.readthedocs.io/en/latest/page.html?highlight=get_textpage_ocr#Page.get_textpage_ocr https://github.com/pymupdf/PyMuPDF-Utilities/blob/master/jupyter-notebooks/partial-ocr.ipynb

Tesseract data files https://tesseract-ocr.github.io/tessdoc/#traineddata-files

Parsing command-line arguments https://docs.python.org/3/howto/argparse.html#id1 https://docs.python.org/3/library/argparse.html
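A sketch of what an `argparse` command line for such a script could look like. The option names (`--toc`, `--offset`, `--output`) and the program name are assumptions made for illustration, not the actual interface of the scripts in this repository.

```python
import argparse


def build_parser():
    """Create a parser for a hypothetical 'add TOC to PDF' command."""
    parser = argparse.ArgumentParser(
        prog="add-toc",  # hypothetical program name
        description="Add a table of contents to a scanned PDF.",
    )
    parser.add_argument("pdf", help="input PDF file")
    parser.add_argument("--toc", required=True, help="text file with TOC entries")
    parser.add_argument(
        "--offset",
        type=int,
        default=0,
        help="difference between PDF page numbers and printed page numbers",
    )
    parser.add_argument("-o", "--output", default=None, help="output PDF file")
    return parser
```

Calling `build_parser().parse_args()` in a script reads `sys.argv`; passing an explicit list, such as `["book.pdf", "--toc", "toc.txt"]`, is convenient for testing.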