This project uses the Python package Scrapy to download a chosen song by The Smashing Pumpkins from the Internet Archive. 🎃
- Python 3.8 or above
- (optional) Poetry
Note: If you'd like to use Docker, skip to Running with Docker.
If you have GNU Make and Poetry installed, you can install the project via make install. If you don't have Poetry and GNU Make, you can create a virtual environment and install the dependencies from the requirements.txt file. For example:
- python3 -m venv venv
- source venv/bin/activate (or venv\Scripts\activate on Windows)
- pip install -r requirements.txt
To download a song, run the Bash script via bin/download <search-term>. For example, bin/run "set the ray to jerry". The script searches the archive for recordings of that song and downloads them to the directory songs/<search-term>. Specifically, the Scrapy spider crawls through all search results and downloads the audio files whose URL contains the search term string. Note: you can run the Scrapy spider directly with scrapy crawl pumpkin -a search_term=<search-term>.
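The filtering rule described above (keep only audio files whose URL contains the search term) can be illustrated with a small standalone sketch. This is not the project's spider code: the helper names, the list of audio extensions, the normalization step (lowercasing and dropping punctuation and spaces), and the example URL are all assumptions made for illustration.

```python
import re
from pathlib import Path
from urllib.parse import unquote, urlparse

# Assumed set of audio formats; the real spider may accept others.
AUDIO_EXTENSIONS = (".mp3", ".flac", ".ogg")


def normalize(text: str) -> str:
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def is_matching_audio_url(url: str, search_term: str) -> bool:
    """Return True if the URL is an audio file whose path contains the search term."""
    path = unquote(urlparse(url).path)
    if not path.lower().endswith(AUDIO_EXTENSIONS):
        return False
    # Hypothetical matching rule: compare normalized forms so that
    # "set the ray to jerry" matches "Set_The_Ray_To_Jerry".
    return normalize(search_term) in normalize(path)


def output_path(url: str, search_term: str, root: str = "songs") -> Path:
    """Build songs/<search-term>/<file name> for a matching URL."""
    file_name = Path(unquote(urlparse(url).path)).name
    return Path(root) / search_term / file_name
```

With a made-up URL such as https://archive.example/download/show/05_Set_The_Ray_To_Jerry.mp3 and the search term "set the ray to jerry", is_matching_audio_url returns True and output_path points inside songs/set the ray to jerry/.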
To run with Docker, you'll first need to build the image and then run a container using that image. You can run make docker-image to build the image or run:
docker build -t pumpkin --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) .
To run the container, you can use the Bash script bin/docker-run <search-term> or run:
docker run --rm -v $PWD/songs:/project/songs pumpkin <search-term>
See the Usage section for more information about how <search-term> works.
This project is distributed under the MIT license. Please see LICENSE for more information.