Convert PDF to HTML without losing text or format.

pdf2htmlEX

A beautiful demo is worth a thousand words:

Browser requirements

Introduction

pdf2htmlEX renders PDF files in HTML using modern Web technologies. It aims to provide accurate rendering while keeping the output optimized for Web display.

pdf2htmlEX works best for text-based PDF files, for example scientific papers with complicated formulas and figures. Text, fonts and formatting are preserved natively in HTML, so you can still search and copy text. The generated HTML file is static, with optional features powered by JavaScript.
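As a rough illustration of the conversion described above, the following Python sketch wraps the `pdf2htmlEX` command-line tool. It assumes the binary is installed and on `PATH`, and that the `--dest-dir` and `--zoom` options behave as described in the tool's manual. The helper names `build_command` and `convert`, the default `out` directory, and the assumption that the output file is named after the input file are illustrative only and not part of pdf2htmlEX.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, dest_dir="out", zoom=None):
    """Build an argument list for the pdf2htmlEX command-line tool.

    Hypothetical helper: only the --dest-dir and --zoom options are used.
    """
    cmd = ["pdf2htmlEX", "--dest-dir", str(dest_dir)]
    if zoom is not None:
        cmd += ["--zoom", str(zoom)]
    cmd.append(str(pdf_path))
    return cmd


def convert(pdf_path, dest_dir="out", zoom=None):
    """Run the conversion and return the path where the HTML is expected.

    Assumption: with no explicit output name, the HTML file takes the
    base name of the input PDF inside dest_dir.
    """
    if shutil.which("pdf2htmlEX") is None:
        raise FileNotFoundError("pdf2htmlEX was not found on PATH")
    subprocess.run(build_command(pdf_path, dest_dir, zoom), check=True)
    return Path(dest_dir) / (Path(pdf_path).stem + ".html")
```

Calling `convert("paper.pdf", zoom=1.3)` would then run the tool once and return `out/paper.html` under the assumptions above. Building the argument list separately from running it makes the command easy to inspect or log before anything is executed.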

Learn more about who should use pdf2htmlEX and why

Features

  • Precise and native text in HTML
  • Flexible Output
  • Moderate Size
  • More PDF features you love: links, outlines & printing
  • SVG background output & Type 3 font conversion

Learn more
Compare with others

Wiki Portals

Get in Touch

  • Personal messages only; no support requests for pdf2htmlEX
  • Accepting messages in 中文, English or 日本語

LICENSE

pdf2htmlEX, as a whole package, is licensed under GPLv3. Some resource files are released under more permissive licenses; see LICENSE for details.

Acknowledgements

pdf2htmlEX is made possible thanks to the following projects:

pdf2htmlEX is inspired by the following projects:

  • pdftops & pdftohtml from poppler
  • MuPDF
  • PDF.js
  • Crocodoc
  • Google Doc

Special Thanks

  • Hongliang Tian
  • Wanmin Liu