PyConZA InternetArchive.org Upload Scripts
------------------------------------------

Installation
------------

* 'pip install -r requirements.txt'

Instructions
------------

The file 'videos.yaml' describes the files to be uploaded.

* Set 'folder: <path-to-the-video-files>'.
* Set 'upload: true' for any videos you wish to upload.

Set up your access tokens using
'export IAS3_TOKEN=<access-token>:<secret-key>'. You can find these tokens at
https://archive.org/account/s3.php after logging in.

Then run 'python upload_videos.py'.

By default, new uploads are added to the "Community Videos" collection unless
a collection is set in the 'default_metadata' section. PyConZA videos should
be uploaded to the 'pyconza' collection. Note that collections are specified
by the corresponding IA identifier, not the collection name.

To check for issues with the metadata, you can do a test upload by changing
the collection to 'test_collection'. Items in that collection are only kept
for 30 days, so be sure to use the 'pyconza' collection for the final upload.

The helpers directory contains various scripts that help massage existing
data into a suitable 'videos.yaml' file. See helpers/README for details.

Uploading from YouTube
----------------------

If your videos are currently in a YouTube playlist, you can download them
using yt-dlp and then generate a 'videos.yaml' file as follows:

* Install yt-dlp by following the instructions at
  https://github.com/yt-dlp/yt-dlp.
* Run 'yt-dlp --write-description --write-info-json --restrict-filenames <youtube-playlist-id-or-url>'
* Run './yt-to-yaml.py -y <year> -o pyconza-<year>.yaml yt/*.info.json'
  (adjust 'yt/*.info.json' to wherever yt-dlp wrote its files).

Check the generated file carefully, rename it to 'videos.yaml' and then run
'python upload_videos.py' as usual.

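
Before starting a long upload, it can be worth checking that 'IAS3_TOKEN' has the '<access-token>:<secret-key>' shape described under Instructions. The following is a minimal sketch of such a check. The function names are hypothetical and are not part of 'upload_videos.py', which may read the token differently. The 'LOW' authorization header is the form shown in the Internet Archive S3 documentation for requests made by hand.

```python
import os


def parse_ias3_token(value):
    """Split '<access-token>:<secret-key>' into (access, secret).

    Hypothetical helper: raises ValueError if either part is missing.
    """
    access, sep, secret = value.strip().partition(":")
    if not sep or not access or not secret:
        raise ValueError("expected '<access-token>:<secret-key>'")
    return access, secret


def authorization_header(access, secret):
    """Build the 'LOW access:secret' header used by the IA S3 interface."""
    return {"authorization": f"LOW {access}:{secret}"}


def token_from_environment(environ=os.environ):
    """Read and validate IAS3_TOKEN from the given environment mapping."""
    return parse_ias3_token(environ.get("IAS3_TOKEN", ""))
```

Calling 'token_from_environment()' raises a 'ValueError' straight away if the variable is unset or lacks the ':' separator, which is easier to diagnose than an authentication failure partway through an upload.
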

Resources
---------

* https://github.com/kngenie/ias3upload (useful list of relevant metadata
  fields)
* http://archive.org/help/abouts3.txt (canonical documentation on the
  Internet Archive S3 interface)
* https://archive.org/developers/ias3.html (developer documentation for the
  S3 interface)
* https://archive.org/account/s3.php (API tokens)
* https://github.com/jjjake/internetarchive/ (internetarchive Python package
  source)