/python-openbook

Proof of concept for an online book parser that sends articles to Readability.com

Readability Book Parser

Idea

There are many great programming books available for free (see http://resrc.io/list/10/list-of-free-programming-books/). For those that cannot be downloaded as a PDF, I needed a tool to format them nicely and store them so I can easily pull them up on my tablet or phone. This little Python script does just that. Pass in a URL to parse, specify a CSS class or ID to look for (ideally some sort of 'table-of-contents' container), and let the script do the rest.
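To illustrate the idea, here is a minimal sketch of the link-collection step, using only the Python standard library. It is not the code from openbook/main.py: the names TocLinkParser and extract_toc_links are hypothetical, only simple '.class' and '#id' selectors are handled, and the markup is assumed to close its tags properly.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TocLinkParser(HTMLParser):
    """Collect hrefs of <a> tags inside the first element matching a selector.

    Hypothetical helper: supports only '.class' and '#id' selectors and
    assumes every non-void tag inside the container is closed explicitly.
    """

    def __init__(self, selector):
        super().__init__()
        self.attr = "id" if selector.startswith("#") else "class"
        self.value = selector.lstrip(".#")
        self.depth = 0      # > 0 while inside the matching container
        self.done = False   # True once the container has been closed
        self.links = []

    def _matches(self, attrs):
        raw = dict(attrs).get(self.attr) or ""
        return self.value in raw.split()

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    self.links.append(href)
        elif self._matches(attrs):
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True


def extract_toc_links(html, selector, base_url):
    """Return absolute, de-duplicated chapter URLs in document order."""
    parser = TocLinkParser(selector)
    parser.feed(html)
    seen, result = set(), []
    for href in parser.links:
        url = urljoin(base_url, href)
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

Each URL returned this way could then be handed to whatever service or parser stores the article; that part depends on the credentials in settings.py and is left out here.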

Usage

Before running the script, copy settings_template.py to settings.py and fill in your own credentials.

python openbook/main.py -u http://openbook.galileocomputing.de/java7/ -c '.main' -t 'Java 7'

Option  Description
-u      URL of the page containing the table of contents to parse (required)
-c      CSS class or ID to parse, e.g. '.main-content' (required)
-t      Tag name in Readability. Defaults to the site's <title> if empty.
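
For reference, the three options above could be declared with argparse roughly as follows. This is a sketch under assumptions, not the script's actual code: the long option names (--url, --css, --tag) and the build_parser function are hypothetical, and main.py may well handle its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the -u, -c and -t options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="openbook/main.py",
        description="Send the chapters of an online book to Readability.",
    )
    # Long option names are assumptions; only the short flags come from the README.
    parser.add_argument("-u", "--url", required=True,
                        help="URL of the page containing the table of contents")
    parser.add_argument("-c", "--css", required=True,
                        help="CSS class or ID of the container, e.g. '.main-content'")
    parser.add_argument("-t", "--tag", default=None,
                        help="tag name in Readability; the caller falls back "
                             "to the site's <title> when this is omitted")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.url, args.css, args.tag)
```

Leaving the default of -t as None keeps the fallback to the page title in the calling code, where the page has already been fetched.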

Disclaimer

Please note that this is my first-ever Python program, so there is probably some room for improvement. Let me know if you find anything obvious! :)

To Do's

  • Write my own (simple) parser instead of relying on Readability for parsing and storing