opensubtitles-scraper

scrape subtitles from opensubtitles.org

primary language: Python, license: MIT

result

torrent RSS feed: opensubtitles.org.dump.torrent.rss

unreleased subs are stored in github.com/milahu/opensubtitles-scraper-new-subs

usage

run get-subs.py to get subtitles for a movie:

~/src/opensubtitles-scraper/get-subs.py Scary.Movie.2000.mp4

video_path Scary.Movie.2000.mp4
video_filename Scary.Movie.2000.mp4
video_parsed MatchesDict([('title', 'Scary Movie'), ('year', 2000), ('container', 'mp4'), ('mimetype', 'video/mp4'), ('type', 'movie')])
output 'Scary.Movie.2000.en.00018286.sub' from 'Scary_eng.txt' (us-ascii)
output 'Scary.Movie.2000.en.00018615.sub' from 'Scary Movie.txt' (us-ascii)
output 'Scary.Movie.2000.en.00106539.sub' from 'Scary Movie - ENG.txt' (us-ascii)
output 'Scary.Movie.2000.en.00117707.sub' from 'scream_english.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.00203573.sub' from 'Scary Movie - ENG.txt' (us-ascii)
output 'Scary.Movie.2000.en.00204203.sub' from 'Scary Movie_engl.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.03112243.srt' from 'Scary Movie 1 (2000).en.bug-fixed.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03142326.srt' from 'kns-sm.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03279944.srt' from 'Scary Movie 1 iNT DvD RiP- WaCkOs.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03318665.srt' from 'rvlt-scarymovie.srt' (us-ascii)
output 'Scary.Movie.2000.en.03552139.srt' from 'Scary Movie.srt' (us-ascii)
output 'Scary.Movie.2000.en.03686957.sub' from 'Scary.Movie.(2000).DVDRIP.Divx.DOMiNION.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.04867080.srt' from 'Scary.Movie.2000.BrRip.720p.x264.YIFY-eng.srt' (Windows-1252)
output 'Scary.Movie.2000.en.05115082.srt' from 'Scary Movie 1.[2000].UNRATED.DVDRIP.XVID.[Eng]-DUQA®.srt' (Windows-1252)
...
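
the output names above seem to follow the pattern `<video name without extension>.<language>.<zero-padded 8-digit number>.<subtitle format>`. a minimal sketch of that naming scheme, inferred only from the sample output: the function name `subtitle_output_name` and its parameters are hypothetical, not taken from get-subs.py, and the subtitle format is passed in because the sample shows it is not simply the source file's extension (`Scary_eng.txt` becomes `.sub`).

```python
from pathlib import Path


def subtitle_output_name(video_filename: str, lang: str, sub_number: int, sub_format: str) -> str:
    """Build an output name like 'Scary.Movie.2000.en.00018286.sub'.

    Hypothetical helper: the pattern is inferred from the sample output,
    not copied from get-subs.py.
    """
    # drop only the last extension: 'Scary.Movie.2000.mp4' -> 'Scary.Movie.2000'
    stem = Path(video_filename).stem
    # zero-pad the number to 8 digits, as in the sample output
    return f"{stem}.{lang}.{sub_number:08d}.{sub_format}"


print(subtitle_output_name("Scary.Movie.2000.mp4", "en", 18286, "sub"))
print(subtitle_output_name("Scary.Movie.2000.mp4", "en", 3112243, "srt"))
```

the zero-padded number keeps the files sorted by number in a directory listing and keeps several subtitles for the same video and language from colliding.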

subtitles server

to run your own subtitles server, see docs/lighttpd.conf to expose get-subs.py as a CGI script on an HTTP server

based on

offline version of opensubtitles

useful for subtitle-fetchers like

scraping

opensubtitles.org is protected by Cloudflare, so I'm using a scraping proxy (zenrows.com). with max_concurrency = 10 in fetch-subs.py, one request takes about 0.2 seconds.
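
a concurrency limit like `max_concurrency` can be enforced with a semaphore. the following is a sketch under assumptions, not the code from fetch-subs.py: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` only sleeps instead of calling the scraping proxy.

```python
import asyncio


async def fetch_all(sub_ids, fetch_one, max_concurrency=10):
    """Run fetch_one for every id, with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(sub_id):
        # wait here while max_concurrency fetches are already running
        async with semaphore:
            return await fetch_one(sub_id)

    # gather returns the results in the same order as sub_ids
    return await asyncio.gather(*(bounded(sub_id) for sub_id in sub_ids))


async def fake_fetch(sub_id):
    # stand-in for a real HTTP request through the proxy (assumption)
    await asyncio.sleep(0.01)
    return f"subtitle {sub_id}"


results = asyncio.run(fetch_all(range(1, 31), fake_fetch))
print(len(results), results[0])
```

a real fetcher would replace `fake_fetch` with an HTTP call and add retries and error handling, which this sketch leaves out.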

videos: