PDF scientific paper translation and bilingual comparison library.
- Provides a simple command line interface.
- Provides a Python API.
- Mainly designed to be embedded into other programs, but can also be used directly for simple translation tasks.
We recommend using the Tool feature of uv to install yadt.
- First, follow the uv installation instructions to install uv and set up the PATH environment variable as prompted.
- Use the following commands to install yadt and check that it runs:
uv tool install --python 3.12 BabelDOC
babeldoc --help
- Use the babeldoc command. For example:
babeldoc --bing --files example.pdf
# multiple files
babeldoc --bing --files example1.pdf --files example2.pdf
We still recommend using uv to manage virtual environments.
- First, follow the uv installation instructions to install uv and set up the PATH environment variable as prompted.
- Use the following commands to install yadt:
# clone the project
git clone https://github.com/funstory-ai/BabelDOC
# enter the project directory
cd BabelDOC
# install dependencies and run babeldoc
uv run babeldoc --help
- Use the uv run babeldoc command. For example:
uv run babeldoc --bing --files example.pdf
# multiple files
uv run babeldoc --bing --files example.pdf --files example2.pdf
Tip
Absolute file paths are recommended.
- --lang-in, -li: Source language code (default: en)
- --lang-out, -lo: Target language code (default: zh)
Tip
Currently, this project mainly focuses on English-to-Chinese translation, and other scenarios have not been tested yet.
- --files: One or more file paths to input PDF documents.
- --pages, -p: Specify pages to translate (e.g., "1,2,1-,-3,3-5"). If not set, translate all pages.
- --split-short-lines: Force split short lines into different paragraphs (may cause poor typesetting & bugs).
- --short-line-split-factor: Split threshold factor (default: 0.8). The actual threshold is the median length of all lines on the current page * this factor.
- --skip-clean: Skip PDF cleaning step.
- --dual-translate-first: Put translated pages first in dual PDF mode (default: original pages first).
- --disable-rich-text-translate: Disable rich text translation (may help improve compatibility with some PDFs).
- --enhance-compatibility: Enable all compatibility enhancement options (equivalent to --skip-clean --dual-translate-first --disable-rich-text-translate).
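The page specification accepted by --pages can be read as a comma-separated list of single pages and ranges, where a missing side of a range is open-ended. The following is a minimal sketch of that reading, not BabelDOC's own parser: the function name parse_pages, the total_pages argument, and the choice to clip ranges at the last page are all assumptions made for illustration.

```python
def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Expand a spec such as "1,2,1-,-3,3-5" into sorted 1-based page numbers.

    Hypothetical helper: it only illustrates one plausible reading of --pages.
    """
    selected: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side is open-ended: "-3" starts at page 1,
            # "5-" runs to the last page.
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else total_pages
        else:
            start = end = int(part)
        if start < 1 or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        # Assumption: ranges that run past the document are clipped.
        selected.update(range(start, min(end, total_pages) + 1))
    return sorted(selected)
```

Under these assumptions, parse_pages("1,3-5", 10) selects pages 1, 3, 4 and 5, and parse_pages("8-", 10) selects pages 8 through 10.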
Tip
- Both --skip-clean and --dual-translate-first may help improve compatibility with some PDF readers.
- --disable-rich-text-translate can also help with compatibility by simplifying translation input.
- However, using --skip-clean will result in larger file sizes.
- If you encounter any compatibility issues, try using --enhance-compatibility first.
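The threshold rule given for --short-line-split-factor (median line length on the page multiplied by the factor) can be worked through with a few numbers. This is only a sketch of the arithmetic: the helper names are hypothetical, "length" is assumed to be any consistent numeric measure of a line, and treating a line as short when it is strictly below the threshold is an assumption, not something this README states.

```python
from statistics import median


def short_line_threshold(line_lengths, factor=0.8):
    """Median line length on the page multiplied by the split factor."""
    return median(line_lengths) * factor


def is_short_line(length, line_lengths, factor=0.8):
    # Assumption: "short" means strictly below the threshold.
    return length < short_line_threshold(line_lengths, factor)


# Five lines on one page; the median length is 79,
# so with the default factor the threshold is about 63.2.
page_line_lengths = [80, 78, 82, 40, 79]
threshold = short_line_threshold(page_line_lengths)
```

With these numbers, only the line of length 40 falls below the threshold. Lowering the factor makes fewer lines count as short.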
- --qps: QPS (Queries Per Second) limit for translation service (default: 4)
- --ignore-cache: Ignore translation cache and force retranslation
- --no-dual: Do not output bilingual PDF files
- --no-mono: Do not output monolingual PDF files
- --openai: Use OpenAI for translation (default: False)
- --bing: Use Bing for translation (default: False)
- --google: Use Google Translate for translation (default: False)
Tip
- You must specify one translation service among --openai, --bing, --google.
- It is recommended to use models with strong compatibility with OpenAI, such as glm-4-flash, deepseek-chat, etc.
- The tool has not yet been optimized for traditional translation engines such as Bing and Google, so using an LLM is recommended.
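To make the --qps limit concrete: a limit of 4 queries per second means consecutive requests are spaced at least 0.25 seconds apart. The class below is a generic sketch of such a limiter and says nothing about how BabelDOC enforces the limit internally. The name QpsLimiter and its injectable clock and sleep parameters are hypothetical.

```python
import time


class QpsLimiter:
    """Space calls at least 1/qps seconds apart (illustrative only)."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed and return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

A caller would invoke wait() before each translation request. The first call returns immediately, and later calls sleep only as long as needed to keep the spacing.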
- --openai-model: OpenAI model to use (default: gpt-4o-mini)
- --openai-base-url: Base URL for OpenAI API
- --openai-api-key: API key for OpenAI service
Tip
- This tool supports any OpenAI-compatible API endpoint. Just set the correct base URL (e.g. https://xxx.custom.xxx/v1) and API key.
- For local models like Ollama, you can use any value as the API key (e.g. --openai-api-key a).
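When yadt is embedded in another program, one simple route is to assemble the command line from the flags documented above and hand it to subprocess. The sketch below only builds the argument list. The function name build_babeldoc_command is hypothetical, and the base URL and model name in the usage part are assumptions for a local Ollama setup that should be replaced with your own values.

```python
import shlex
from pathlib import Path


def build_babeldoc_command(files, base_url, model, api_key, output_dir=None):
    """Build an argument list for the babeldoc CLI using documented flags only."""
    cmd = [
        "babeldoc",
        "--openai",
        "--openai-model", model,
        "--openai-base-url", base_url,
        "--openai-api-key", api_key,
    ]
    for name in files:
        # Absolute paths are recommended, so resolve each input file.
        cmd += ["--files", str(Path(name).resolve())]
    if output_dir is not None:
        cmd += ["--output", str(output_dir)]
    return cmd


if __name__ == "__main__":
    command = build_babeldoc_command(
        ["example1.pdf", "example2.pdf"],
        base_url="http://localhost:11434/v1",  # assumed local Ollama endpoint
        model="your-local-model",              # placeholder model name
        api_key="a",                           # any value works for local models
    )
    print(shlex.join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Passing the list form to subprocess avoids shell quoting problems with file names that contain spaces.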
- --output, -o: Output directory for translated files. If not set, use current working directory.
- --debug, -d: Enable debug logging level and export detailed intermediate results in ~/.cache/yadt/working.
--config,-c: Configuration file path. Use the TOML format.
Example Configuration:
[babeldoc]
debug = true
lang-in = "en-US"
lang-out = "zh-CN"
qps = 20
# this is a comment
# pages = 4
openai = true
openai-model = "SOME_ALSOME_MODEL"
openai-base-url = "https://example.example/v1"
openai-api-key = "[KEY]"
# All other options can also be set in the configuration file.
You can refer to the example in main.py to use BabelDOC's Python API.
Please note:
- Make sure to call babeldoc.high_level.init() before using the API.
- The current TranslationConfig does not fully validate input parameters, so you need to ensure their validity yourself.
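Because input validation is left to the caller, it can help to check options before handing them to the API. The sketch below is one possible pre-check and does not import or call BabelDOC. The function check_options and its dictionary layout are hypothetical, the keys simply mirror the command line option names, and the rule that --openai requires an API key is an assumption.

```python
from pathlib import Path

# Option names mirror the command line flags listed in this README.
SERVICES = ("openai", "bing", "google")


def check_options(options):
    """Return a list of problems found in `options`; an empty list means none."""
    problems = []

    chosen = [name for name in SERVICES if options.get(name)]
    if len(chosen) != 1:
        problems.append("specify one translation service among --openai, --bing, --google")

    # Assumption: the OpenAI route always needs some API key value.
    if options.get("openai") and not options.get("openai-api-key"):
        problems.append("--openai needs --openai-api-key")

    qps = options.get("qps", 4)  # 4 is the documented default
    if isinstance(qps, bool) or not isinstance(qps, (int, float)) or qps <= 0:
        problems.append("qps must be a positive number")

    for name in options.get("files", []):
        path = Path(name)
        if path.suffix.lower() != ".pdf" or not path.is_file():
            problems.append(f"not an existing PDF file: {name}")

    return problems
```

A caller would build the real configuration only when the returned list is empty, and otherwise report the problems to the user.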
Many projects and teams are working to make document editing and translation easier.
There are also solutions that address specific parts of the problem, such as:
- layoutreader: the reading order of the text blocks in a PDF
- Surya: the structure of the PDF
This project hopes to promote a standard pipeline and interface to solve the problem.
In fact, a PDF parser or translator has two main stages:
- Parsing: extracting the structure of the PDF, such as text blocks, images, and tables.
- Rendering: rendering that structure into a new PDF or another format.
A service like mathpix parses the PDF into a structure, possibly in an XML format, and then renders it in a single-column reading order, as layoutreader does. The bad news is that the original structure is lost.
Some people use Adobe PDF Parser because it generates a Word document that keeps the original structure, but it is somewhat expensive. Moreover, a PDF or Word document is not a good format for reading on mobile devices.
We offer an intermediate representation of the parser's results that can be rendered into a new PDF or another format. The pipeline is also plugin-based, so anyone can add their own model, OCR, renderer, etc.
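The split between parsing, an intermediate representation, and pluggable renderers can be sketched in a few lines. This is a toy illustration of the idea and not BabelDOC's actual intermediate representation or plugin interface: TextBlock, Document, RENDERERS, and the renderer decorator are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TextBlock:
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1)
    text: str


@dataclass
class Document:
    """Parser output that every renderer consumes."""
    blocks: list[TextBlock] = field(default_factory=list)


# Registry of renderer plugins, keyed by output format name.
RENDERERS: dict[str, Callable[[Document], str]] = {}


def renderer(name):
    """Decorator that registers a function as the renderer for `name`."""
    def register(func):
        RENDERERS[name] = func
        return func
    return register


@renderer("plain-text")
def render_plain_text(doc):
    # Assumption: the origin is top-left, so a smaller y0 is higher on the page.
    ordered = sorted(doc.blocks, key=lambda b: (b.page, b.bbox[1], b.bbox[0]))
    return "\n\n".join(b.text for b in ordered)


def translate_blocks(doc, translate):
    """Return a new Document with translated text and unchanged layout."""
    return Document([TextBlock(b.page, b.bbox, translate(b.text)) for b in doc.blocks])
```

Because translation only rewrites the text field, the layout information survives and any registered renderer can produce output from either the original or the translated document.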
- Add line support
- Add table support
- Add cross-page/cross-column paragraph support
- More advanced typesetting features
- Outline support
- ...
Our goal for the first 1.0 release is to complete a translation of the PDF Reference, Version 1.7 into the following languages:
- Simplified Chinese
- Traditional Chinese
- Japanese
- Spanish
And meet the following requirements:
- layout error less than 1%
- content loss less than 1%
- Parsing errors occur in the author and reference sections; they get merged into one paragraph after translation.
- Lines are not supported.
- Drop caps are not supported.
We encourage you to contribute to YADT! Please check out the CONTRIBUTING guide.
Everyone interacting in YADT and its sub-projects' codebases, issue trackers, chat rooms, and mailing lists is expected to follow the YADT Code of Conduct.

