Polar is a personal knowledge repository for PDF and web content supporting incremental reading and document annotation.

Polar Bookshelf



Polar Bookshelf is an incremental reading tool and personal knowledge repository for PDF and web content, created using the Electron framework and PDF.js.

Features

  • PDF support We have first-class PDF support thanks to PDF.js. PDFs work well when reading content in book format or when reading scientific research which is often stored as PDF.

  • Captured Web Pages Download HTML content and save them as offline documents which can be annotated.

  • Pagemarks Easily keep track of what you're reading and the progress of each document.

  • Text Highlights Highlight text in PDF and web pages.

  • Area Highlights Capture a region of the page as a highlight which can be a chart, figure, infographic, etc.

  • Local Storage All content is stored locally. You can also use a system like git or Dropbox to transfer your repository across machines.

  • Hackable The entire system is based on Electron, Node, pdf.js, React and other web standards. If you're a developer - welcome home!

  • Standards Based All content is stored as JSON in a well documented schema. Annotations never mutate the original content.

  • Portable Runs on any platform. Linux, MacOS, and Windows are supported. We also produce snaps, so in addition to installing our .deb files on Ubuntu or Debian you can install Polar on any Linux distribution that supports snaps!

Screenshots

PDF Document

Polar has excellent PDF support.

Captured Web Content

Polar supports fetching and storing web content locally for annotating.

Annotations

Annotating a PDF including pagemarks showing content already read, an area highlight, and a text highlight.

Repository

Polar includes a document repository manager to manage all your documents, open up a new editor, sort them as a queue or by priority, etc.

Packages for Windows, MacOS, and Linux are available on the releases page.

Note that our packages are NOT currently signed so you will receive an error on Windows and MacOS before installation.

We also have a CHANGELOG available if you're interested in what went into each release.

Discussion

We have both a Discord group and Reddit group if you want to discuss Polar.

If it's a very technical issue it might be best to create a GitHub Issue.

Personal Knowledge Repository

Polar is a document manager for PDF and web content as well as a personal knowledge repository.

It allows you to keep all important reading material in one place including annotations and flashcards for spaced repetition.

It supports features like pagemarks, text highlights, and progress tracking. It keeps track of how much you've read and restores your pagemarks when you re-open documents.

Pagemarks are a new concept for tracking your reading, inspired by incremental reading. They let you suspend reading for weeks or months and resume when you're ready, without losing your place.

Since you can create multiple pagemarks they work even if you jump around in a book (which is common in technical or research work).

Web Content

PDF is an excellent document format but we've found that many HTML pages don't convert to PDF well since they were not intended to be printed.

Captured pages contain HTML content stored in phz (polar HTML zip) files.

We fetch all resources, render the page as DOM and apply CSS, then de-activate the page by removing all scripts.

We then store the content in the phz archive format and serve the content directly to Electron.

This means you have long term storage for all your content. You can annotate it and use pagemarks without risk of the content changing.
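The de-activation step can be illustrated with a minimal sketch. It is not Polar's actual capture code: the function name `deactivateHtml` is hypothetical, and a regular expression stands in for the DOM processing described above only so the example stays self-contained. It removes `<script>` elements and nothing else.

```typescript
// Hypothetical helper, not part of Polar's API.
// Removes <script> elements from an HTML string so the stored page is static.
// A regex is used only to keep this sketch self-contained; it is a
// simplification of working on the rendered DOM.
function deactivateHtml(html: string): string {
    // Match an opening <script ...> tag, its (possibly multi-line) body,
    // and the closing </script> tag, case-insensitively.
    return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, "");
}

const captured = deactivateHtml(
    "<html><body><p>Hello</p><script>alert(1)</script></body></html>"
);
console.log(captured);
```

In this sketch the markup and text of the page are left untouched, so the captured copy still renders while no longer running any of the original scripts.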

To capture a new page just select File | Capture Web Page then enter a URL.

After that the page will be captured and then loaded.

Local Storage

All annotations, documents, PHZ files and other data are persisted on disk in your ~/.polar directory (the exact location differs by platform). When you re-open a PDF or PHZ file your pagemarks and other annotations are restored.

Since storage is local you're not reliant on one specific cloud provider. You can also use tools like git or Dropbox to synchronize across machines.

Pagemarks

Pagemarks provide a way for you to keep track of your reading by marking portions of your document as 'read'. You can have multiple pagemarks per document.

Additionally there is a progress bar that tracks the progress of the document based on the number of pagemarks you've created.
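As a rough illustration of how a progress value could be derived from pagemarks, here is a minimal sketch. The `Pagemark` shape and the `computeProgress` function are assumptions made for this example and are not taken from Polar's schema or code. The sketch assumes each pagemark records a page number and how much of that page it covers.

```typescript
// Hypothetical pagemark shape; Polar's real schema may differ.
interface Pagemark {
    page: number;       // 1-based page number
    percentage: number; // portion of that page marked as read, 0..100
}

// Returns overall progress as a value between 0 and 100.
function computeProgress(pagemarks: Pagemark[], nrPages: number): number {
    if (nrPages <= 0) {
        return 0;
    }
    // Sum coverage per page, capping each page at 100 so several
    // pagemarks on the same page cannot count for more than one page.
    const perPage: { [page: number]: number } = {};
    for (const mark of pagemarks) {
        const current = perPage[mark.page] !== undefined ? perPage[mark.page] : 0;
        perPage[mark.page] = Math.min(100, current + mark.percentage);
    }
    let covered = 0;
    for (const key of Object.keys(perPage)) {
        covered += perPage[Number(key)];
    }
    return covered / nrPages;
}

const progress = computeProgress(
    [
        { page: 1, percentage: 100 },
        { page: 2, percentage: 50 },
        { page: 7, percentage: 100 },
    ],
    10
);
console.log(progress); // 25
```

Because pagemarks are summed per page rather than taken as a single "last page read" position, this kind of calculation still works when you jump around in a document.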

Right now pagemarks can only be used via key bindings:

Linux / Windows key bindings

  • Control Alt N - create a new pagemark on the current page
  • Control Alt click - create a pagemark on the page up until the current mouse click
  • Control Alt E - erase the current pagemark

MacOS key bindings

  • Meta-Command N - create a new pagemark on the current page
  • Meta-Command click - create a pagemark on the page up until the current mouse click
  • Meta-Command E - erase the current pagemark

Text Highlights

Text highlights allow you to work with content like you're using a text highlighter in a book.

Create a text highlight.

Select the text you want to highlight, then press Ctrl-Alt-T.

Delete a text highlight.

Right click the highlight and select delete.

Key bindings:

  • Ctrl-Alt-T - create a new text highlight from the current selected text.

Area Highlights

Area highlights allow you to highlight a figure, infographic, or anything visual in a document.

Create an area highlight.

Right click on a page and select "Create area highlight".

Delete an area highlight.

Right click the highlight and select delete.

Flashcards

Flashcards allow you to retain information long term by using a spaced repetition system like Anki to continually re-train yourself on material you want to retain.

Flashcards can be created by right clicking an annotation and selecting "Create Flashcard". The resulting flashcards are stored as annotations in your repository.

Status

This is currently a beta feature and we're working on implementing Anki sync to enable spaced repetition. Any flashcards created now will be stored with Anki in the future.

Polar is very reliable to use for day to day PDF and web content annotation.

We're expecting to release a 1.0 in Sept 2018 with Anki sync support and initial annotation support.

Hackable

Since the entire platform is based on Electron (Node + Chromium), it is easy to work with, which means developers can contribute easily.

Feel free to fork and send a pull request if there's some interesting feature you would like to add.

Data

All data is stored on disk in JSON format. This also includes extracted metadata from the document. For example, text highlights include the source text that you copied as well as pointers into the original document where they can be found.
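To make this concrete, here is a sketch of what a stored text highlight could look like and how it round-trips through JSON. Every field name below (`id`, `text`, `page`, `rects`) is hypothetical and chosen for illustration only. Consult the documented schema for the real format.

```typescript
// Hypothetical shapes; not Polar's documented schema.
interface Rect {
    left: number;
    top: number;
    width: number;
    height: number;
}

interface TextHighlight {
    id: string;
    text: string;  // the source text that was highlighted
    page: number;  // pointer into the original document: which page
    rects: Rect[]; // pointer into the original document: where on the page
}

const highlight: TextHighlight = {
    id: "example-1",
    text: "incremental reading",
    page: 3,
    rects: [{ left: 72, top: 140, width: 180, height: 14 }],
};

// Serialize to the JSON text that would be written to disk...
const json = JSON.stringify(highlight, null, 2);

// ...and parse it back, as would happen when a document is re-opened.
const restored = JSON.parse(json) as TextHighlight;
console.log(restored.text, "on page", restored.page);
```

Keeping the highlighted text alongside the pointers means an annotation remains readable on its own, and the original document never has to be modified.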

Pending Features

We're currently working on landing a few key features which are halfway implemented including:

  • A rework of text highlights for PDFs

  • Thumbnails of highlights (text + area) stored in .json

  • Flashcard integration with Anki support. The flashcard UI is mostly complete but I need feedback on the design.

Check out our roadmap to see if any features you need are planned for a future release.

Feel free to jump on any of these issues if you're a developer and would like to implement them yourself.

Donations also help to support the project and encourage specific features.

Principles

We believe the following design principles are core to seeing this as a successful project.

  • All the data should support long term file formats. The on disk format we use is JSON.

  • Portability to all platforms is critical. We're initially targeting Linux (Ubuntu), MacOS, and Windows. You shouldn't have to pick a tool that you might be using for the next 5-10 years and then get stuck on a platform which may or may not exist in the future.

Build from source

Install NodeJS and npm for your platform.

To run:

$ git clone https://github.com/burtonator/polar-bookshelf
$ cd polar-bookshelf
$ npm install && npm start

Donations

If you'd like to donate we accept Bitcoin at bc1q059asaaqjt5cultx993gfytjssj4g6fw3q8n7g

All donations go to supporting Polar, which includes website hosting costs, web designer costs, continuous integration services, etc.

License

Polar is distributed under the GPL license.

PDF.js is available under the Apache License. Electron is released under the MIT License. The rest of the code is MIT licensed.

Icons made by Freepik from www.flaticon.com are licensed under CC 3.0 BY.