/koc_khan_offline

Tools for pulling the Khan Academy videos offline for use with the Kids on Computers project

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

This is a project for Kids on Computers to give access to Khan Academy offline

There are several tools available here:

http://www.khanacademy.org/downloads

However, the closest existing tool is designed for running a local instance of the website in
Windows (using google app engine) and so this version of the tool took the python scripts from 
http://code.google.com/p/khanacademy/downloads/detail?name=KhanAcademy-1482.zip&can=2&q= and have been adapting them to

a) work with other video formats (mp4, ogg) that exist on archive.org
b) allow for updating of the canonical list of available videos based on the archive.org rss feed
c) provide a command-line interface for interacting with this tool
d) use json output from archive.org to build the information instead of a hard-coded mapping that can get out of date

TODO:

* Check for file type of url - is video _good_ is html _bad_ (don't download those!)
* keep a list of what doesn't resolve at the time and look at why
* gather all correct video urls, write to file so that there's persistence in getting all new vids
* SMS if there are updates available (twilio)
* Create our own 'readable_id' in the mappings for new additions based on title.tolower() s/ /_
* integrate youTube downloading not as an exception handling case
* better handling for keyboard interupt while downloading (incomplete file downloads)
* easter egg - download one monty python video :)