Helper scripts for uploading videos to archive.org

PyConZA InternetArchive.org Upload Scripts
------------------------------------------

Installation
------------

* Run 'pip install -r requirements.txt'.

Instructions
------------

The file 'videos.yaml' describes the files to be uploaded.

  * Set 'folder: <path-to-the-video-files>'.
  * Set 'upload: true' for any videos you wish to upload.
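
As a rough illustration of how the two settings above interact, the sketch
below picks out the entries marked for upload and resolves them against the
folder. The key names 'videos', 'file' and 'title', and the example paths, are
assumptions made for this sketch; check the 'videos.yaml' shipped with the
scripts for the real layout.

```python
from pathlib import Path

# Hypothetical result of parsing videos.yaml. Only 'folder' and 'upload'
# come from the instructions above; the other keys are assumed.
config = {
    "folder": "/data/pyconza-videos",
    "videos": [
        {"file": "keynote.mp4", "title": "Keynote", "upload": True},
        {"file": "lightning.mp4", "title": "Lightning talks", "upload": False},
    ],
}


def videos_to_upload(config):
    """Return (path, entry) pairs for every entry with 'upload: true'."""
    folder = Path(config["folder"])
    return [
        (folder / entry["file"], entry)
        for entry in config.get("videos", [])
        if entry.get("upload") is True
    ]


selected = videos_to_upload(config)
for path, entry in selected:
    print(f"would upload {path} ({entry['title']})")
```

Entries without 'upload: true' are skipped, so a file can stay listed in
'videos.yaml' without being uploaded again.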

Set up your access tokens by running
'export IAS3_TOKEN=<access-token>:<secret-key>'. You can find both values at
https://archive.org/account/s3.php after logging in.
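
The sketch below shows one way a script could read the IAS3_TOKEN variable
and split it into its two parts. The helper names are hypothetical and this
is not necessarily how upload_videos.py does it. The 'LOW' authorization
scheme is the one described in the Internet Archive S3 documentation listed
under Resources.

```python
import os


def parse_ias3_token(value):
    """Split '<access-token>:<secret-key>' into its two parts."""
    access, sep, secret = value.partition(":")
    if not sep or not access or not secret:
        raise ValueError("IAS3_TOKEN must look like <access-token>:<secret-key>")
    return access, secret


def authorization_header(access, secret):
    """Build the 'LOW' authorization header used by the IA S3 interface."""
    return {"authorization": f"LOW {access}:{secret}"}


# The fallback is a placeholder for illustration only; a real run relies on
# the exported variable.
token = os.environ.get("IAS3_TOKEN", "EXAMPLEACCESS:EXAMPLESECRET")
access, secret = parse_ias3_token(token)
print("access key:", access)
```

Failing early on a malformed value gives a clearer error than a rejected
upload later on.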

Then run 'python upload_videos.py'.

By default, new uploads are added to the "Community Videos" collection unless
a collection is set in the default_metadata section. PyConZA videos
should be uploaded to the 'pyconza' collection. Note that collections are
specified by their IA identifier, not by their display name.

To check for metadata issues, you can first upload to the "test_collection"
collection. Items uploaded there are only kept for 30 days, so be sure
to use the "pyconza" collection for the final upload.
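
A minimal sketch of how default and per-video metadata might be combined,
with a switch for trial uploads. The function name, the 'test' flag and the
'title' key are assumptions made for illustration; only 'default_metadata',
'pyconza' and 'test_collection' come from the text above.

```python
def build_metadata(default_metadata, video_metadata, test=False):
    """Merge per-video metadata over the defaults without mutating either."""
    merged = {**default_metadata, **video_metadata}
    if test:
        # Trial run: items in test_collection are only kept for 30 days.
        merged["collection"] = "test_collection"
    return merged


default_metadata = {"collection": "pyconza"}
video = {"title": "Keynote"}

final = build_metadata(default_metadata, video)
trial = build_metadata(default_metadata, video, test=True)
print(final["collection"], trial["collection"])
```

Keeping the switch in one place makes it harder to leave "test_collection"
in the metadata by accident for the final upload.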

The helpers directory contains various scripts for massaging existing
data into a suitable 'videos.yaml' file. See helpers/README for details.

Uploading from YouTube
----------------------

If your videos are currently in a YouTube playlist, you can download them
using yt-dlp and then generate a 'videos.yaml' file as follows:

  * Install yt-dlp by following the instructions at
    https://github.com/yt-dlp/yt-dlp.

  * Run 'yt-dlp --write-description --write-info-json --restrict-filenames
    <youtube-playlist-id-or-url>'

  * Run './yt-to-yaml.py -y <year> -o pyconza-<year>.yaml yt/*.info.json'

Check the generated file carefully, rename it to 'videos.yaml' and then run
'python upload_videos.py' as usual.
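
To give a feel for the conversion step, the sketch below maps a few fields
from yt-dlp's '.info.json' files onto upload entries. It is not the actual
yt-to-yaml.py. The output keys ('title', 'description', 'date', 'source',
'upload') are assumptions, while 'title', 'description', 'upload_date' and
'webpage_url' are fields yt-dlp writes to its info files.

```python
import json
import tempfile
from pathlib import Path


def entry_from_info(info):
    """Map a few yt-dlp info.json fields onto a hypothetical upload entry."""
    upload_date = info.get("upload_date", "")  # yt-dlp writes YYYYMMDD
    if len(upload_date) == 8:
        date = f"{upload_date[:4]}-{upload_date[4:6]}-{upload_date[6:]}"
    else:
        date = ""
    return {
        "title": info.get("title", ""),
        "description": info.get("description", ""),
        "date": date,
        "source": info.get("webpage_url", ""),
        "upload": False,  # left off so each entry is reviewed first
    }


def entries_from_folder(folder):
    """Build one entry per *.info.json file found in the folder."""
    return [
        entry_from_info(json.loads(path.read_text(encoding="utf-8")))
        for path in sorted(Path(folder).glob("*.info.json"))
    ]


# Self-contained demonstration using a made-up info.json file.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "title": "Keynote",
        "description": "Opening talk",
        "upload_date": "20231005",
        "webpage_url": "https://example.com/watch?v=abc",
    }
    (Path(tmp) / "Keynote.info.json").write_text(json.dumps(sample), encoding="utf-8")
    entries = entries_from_folder(tmp)

print(entries[0]["title"], entries[0]["date"])
```

Defaulting 'upload' to false mirrors the advice above to check the generated
file carefully before uploading anything.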

Resources
---------

* https://github.com/kngenie/ias3upload (useful list of relevant metadata fields)
* http://archive.org/help/abouts3.txt (canonical documentation on Internet Archive S3 interface)
* https://archive.org/developers/ias3.html (developer documentation for the S3 interface)
* https://archive.org/account/s3.php (API tokens)
* https://github.com/jjjake/internetarchive/ (internetarchive Python package source)