Rip-it with Rippy
Rippy is a downloader designed to scrape websites using a real web browser to find, e.g., videos or downloadable files. The targets are websites that try to be scrape-resistant and where other downloaders have had to give up.
The magic is that Rippy uses a real browser it controls, so a lot of the normal anti-bot designs are ineffective, e.g. scrambled JavaScript. To block Rippy you would have to block browsers. I also enjoy a blocking arms race; it keeps my day bright and fulfilled.
Currently the only officially provided distribution method is docker-compose, but all it really requires is Chrome and Python.
wget https://github.com/JohnDoee/rippy-docker/raw/master/docker-compose.yml
You should edit docker-compose.yml. The following values should be changed:
- /tmp/media should be changed to where you want rippy to download data. It appears in the file twice, so change both.
- BASIC_AUTH_PASSWORD should be changed to a unique password.
- SECRET_KEY should be changed to something unique.
- Optional: Change RIPPY_CONCURRENCY to how many scrape and download threads you want to have.
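If you would rather not invent the password and secret key by hand, the values above can be generated with Python's standard `secrets` module. This is only a sketch: the helper names `generate_secret` and `set_media_path` are made up for illustration and are not part of rippy, and the exact layout of docker-compose.yml is not assumed beyond the literal string /tmp/media appearing in it.

```python
import secrets
from typing import Tuple


def generate_secret(num_bytes: int = 32) -> str:
    """Return a random URL-safe string usable as a password or secret key."""
    return secrets.token_urlsafe(num_bytes)


def set_media_path(compose_text: str, new_path: str) -> Tuple[str, int]:
    """Replace every /tmp/media in the compose text.

    Returns the new text and the number of replacements, so you can
    check that both occurrences were actually changed.
    """
    count = compose_text.count("/tmp/media")
    return compose_text.replace("/tmp/media", new_path), count


if __name__ == "__main__":
    # Paste these values into docker-compose.yml by hand.
    print("BASIC_AUTH_PASSWORD:", generate_secret(24))
    print("SECRET_KEY:", generate_secret(48))
```

To use `set_media_path`, read docker-compose.yml into a string, call it with your download directory, and check that the returned count is 2 before writing the file back.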
docker-compose up -d
Head over to http://ip:51359 and add a job. It should start downloading or prompt you to do something manually.
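If the page does not load, a quick way to tell "the container is not listening" apart from "the browser cannot get there" is a plain TCP check against port 51359. The sketch below uses only the Python standard library; `is_reachable` is a hypothetical helper name, and the host you pass in is assumed to be the address of the machine running docker-compose.

```python
import socket


def is_reachable(host: str, port: int = 51359, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("rippy web UI reachable:", is_reachable("127.0.0.1"))
```

This only confirms that something accepts connections on the port; it does not log in or check that the job page itself works.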
If the status text says “Waiting”, it means you need to open the browser and fill in a captcha or something similar. If you are using the docker-compose setup, there should be a button in the upper-right corner of the website to open the browser. It will open a new window with a VNC session to the hosted Chromium browser.
Feel free to request a new scraper, but there are a few requirements if you want me to implement it:
- The site must be scrape-resistant, as in, nobody else is able to download from it. Check out tools like youtube-dl and JDownloader first.
- The site must not use encryption or sit behind a paywall, i.e. I can’t do stuff like Netflix (something like that is also not the target at all).
Currently a generic video-site scraper is on the slab as this project is a merge between a reddit post and a generic video-site scraper
Docker-compose file and docker chromium repository
Q: My tab crashed or elements on the website crashed, what should I do?
A: Close the tab, rippy should notice it shortly and try again.
- [ ] Add (semi-)generic view player extractor
- [ ] Return (potentially proxied) URL to video instead of downloading
- Avgle
Main backend component (this repository)
frog by habione 404 from the Noun Project
MIT