📖 epub-parser

A powerful yet easy-to-use epub parser

The package exports a simple parser function that takes an epub file as input and returns a JavaScript object.

As it is written in TypeScript, types are already included in the package.

Inspired by @gxl/epub-parser, but without the dependency on fs, which allows this lib to be used with react-native.

Install

npm install davidka/epub-parser --save

or if you prefer yarn

yarn add davidka/epub-parser

Usage

import parser from 'davidka/epub-parser'
// if you use `require` don't forget to add `.default`
// const parser = require('davidka/epub-parser').default

console.log('epub content:', parser(binaryData))

parser(target: buffer, options?: object): EpubObject

target

type: buffer
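
Since the parser expects the raw bytes of the file, a cheap sanity check before calling it can catch obviously wrong input. The sketch below relies on the fact that an EPUB is packaged as a ZIP archive, so the file should start with the ZIP local file header signature (bytes 0x50 0x4B 0x03 0x04). looksLikeZip is a hypothetical helper written for illustration, not part of this package, and it takes a Uint8Array (which a Node Buffer extends) so that it does not depend on any particular runtime.

```typescript
// Hypothetical helper, not part of the package.
// ZIP local file header signature: "PK" followed by 0x03 0x04.
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]

function looksLikeZip(bytes: Uint8Array): boolean {
  // Too short to even hold the signature.
  if (bytes.length < ZIP_SIGNATURE.length) return false
  for (let i = 0; i < ZIP_SIGNATURE.length; i++) {
    if (bytes[i] !== ZIP_SIGNATURE[i]) return false
  }
  return true
}
```

A caller might check looksLikeZip(binaryData) first and only pass binaryData to parser when it returns true.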

EpubObject

The output is an object that contains structure, sections, and info, along with some other properties that deal with the epub file. Those other properties start with _. I don't recommend using them, since they are subject to change. They are exposed simply because JavaScript doesn't have native support for private member variables, and sometimes they are helpful for debugging.
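
If you want to inspect a parsed object without touching the underscore-prefixed properties, filtering keys by their first character is one option. This is only a sketch: publicKeys is a hypothetical helper, and the sample object (including the _zip name) is made up for illustration rather than taken from the real output.

```typescript
// Hypothetical helper, not part of the package:
// returns the keys that do not start with an underscore.
function publicKeys(obj: object): string[] {
  return Object.keys(obj).filter((key) => key.charAt(0) !== '_')
}

// Made-up stand-in for a parsed epub object; `_zip` is an invented name.
const sample = { structure: [], sections: [], info: {}, _zip: {} }

// Logs the three public keys: structure, sections, info
console.log(publicKeys(sample))
```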

structure is the parsed toc of the epub file; it contains information about how the book is constructed.

sections is an array of chapters or sections under chapters, which are referred to in structure. Each section object contains the raw html string and a few handy methods to help you with your needs. toMarkdown converts the current section to a markdown object. toHtmlObjects converts it to html objects. A note about src and href: the src and href in the raw html stay untouched, but the toHtmlObjects method resolves src to a base64 string and alters href so that it makes sense in the parsed epub. The parsed href looks something like #{sectionId},{hash}.
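
To follow a parsed href back to its section, it can help to split it into its two parts. The sketch below assumes the format is literally # followed by the section id, a comma, and the hash, as described above. splitParsedHref and ParsedHref are hypothetical names, not part of the package, and treating a missing comma as an empty hash is my assumption.

```typescript
// Hypothetical types and helper, not part of the package.
interface ParsedHref {
  sectionId: string
  hash: string
}

function splitParsedHref(href: string): ParsedHref | null {
  // Anything that does not start with '#' is not a parsed in-book link.
  if (href.charAt(0) !== '#') return null
  const body = href.slice(1)
  const comma = body.indexOf(',')
  // Assumption: no comma means the link targets the section itself.
  if (comma === -1) return { sectionId: body, hash: '' }
  return { sectionId: body.slice(0, comma), hash: body.slice(comma + 1) }
}
```

For example, splitParsedHref('#chapter1,note-3') yields a sectionId of 'chapter1' and a hash of 'note-3', where both values are invented for illustration.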

One more thing

It provides some util functions as well.

They can be used via

import { parseLink, parseHTML, parseNestedObject, flattenArray } from 'davidka/epub-parser'

  • parseLink
  • parseHTML
  • parseNestedObject

How to contribute

  • Raise an issue in the issue section.
  • PRs are the best.

❤️