
Convert PDF to HTML without losing text or format.


pdf2htmlEX

A beautiful demo is worth a thousand words:

Browser requirements

Introduction

pdf2htmlEX renders PDF files in HTML using modern Web technologies. It aims to render documents accurately while keeping the output optimized for Web display.

pdf2htmlEX works best on text-based PDF files, such as scientific papers with complicated formulas and figures. Text, fonts, and formatting are preserved natively in HTML, so you can still search and copy the text. The generated HTML file is static, with optional features powered by JavaScript.

Learn more about who should use pdf2htmlEX and why

Features

  • Precise and native text in HTML
  • Flexible output
  • Moderate size
  • More of the PDF features you love: links, outlines & printing
  • Experimental: SVG background output & Type 3 font conversion

Learn more
Compare with others

Wiki Portals

Resources

Contact

Note: Your message will most likely be ignored if you fail to follow this guidance.

  • Accepting messages in 中文, English or 日本語.

LICENSE

pdf2htmlEX, as a whole package, is licensed under GPLv3 with additional terms (see below). Some resource files are released under more relaxed licenses; read LICENSE for more details.

For Online Services

You are free and welcome to modify pdf2htmlEX for your online services, but you should credit pdf2htmlEX if your service involves "online conversion" facilitated by pdf2htmlEX. You are also encouraged to send me a name and a URL for statistical purposes.

Read LICENSE for more details.

Acknowledgements

pdf2htmlEX is made possible thanks to the following projects:

pdf2htmlEX is inspired by the following projects:

  • pdftops & pdftohtml from poppler
  • MuPDF
  • PDF.js
  • Crocodoc
  • Google Docs

Special Thanks

  • Hongliang Tian
  • Wanmin Liu