/web_downloader

Scripts to search and download from websites

Primary language: Python. License: MIT.

downloader

This is part of the Chinese transgender digital archive project.

Scripts and results for searching and downloading webpages.

Search

  • puppeteer: search for webpages using Puppeteer.
  • serper: search for webpages using Serper.
  • googlecustom: search for webpages using the Google Custom Search JSON API.
  • google: search for webpages using the google Python library.
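As a rough illustration of the googlecustom approach, the sketch below builds a request URL for the Google Custom Search JSON API and pulls the result links out of a response. It is not the code in this repository: the function names (`build_search_url`, `extract_links`, `search`) are hypothetical, and the API key and search engine ID are placeholders you would supply yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public endpoint of the Google Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, start=1):
    """Build the request URL for one page of results (hypothetical helper)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_links(response):
    """Return the `link` field of every item in a parsed JSON response."""
    return [item["link"] for item in response.get("items", []) if "link" in item]


def search(query, api_key, engine_id, start=1):
    """Fetch one page of results and return the links (needs network access)."""
    with urlopen(build_search_url(query, api_key, engine_id, start)) as resp:
        return extract_links(json.load(resp))
```

Keeping URL construction and link extraction separate from the network call makes both easy to check offline, for example against a saved JSON response.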

Run ./gen_links to summarize all links into a YAML file.
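The internals of gen_links are not described here, so the following is only a sketch of what such a summarizing step could look like: it merges link lists from several search backends, drops duplicates, and emits a small YAML document. The function names, the `link`/`source` field names, and the output layout are all assumptions, not the project's actual format.

```python
import json


def merge_links(results_by_source):
    """Merge per-source link lists, keeping the first occurrence of each URL."""
    seen = set()
    merged = []
    for source, links in results_by_source.items():
        for link in links:
            url = link.strip()
            if url and url not in seen:
                seen.add(url)
                merged.append({"link": url, "source": source})
    return merged


def to_yaml(entries):
    """Serialize entries as a YAML sequence of mappings (assumed layout).

    JSON string quoting is used for the scalars, since JSON double-quoted
    strings are also valid YAML double-quoted scalars.
    """
    if not entries:
        return "[]\n"
    lines = []
    for entry in entries:
        lines.append(f"- link: {json.dumps(entry['link'], ensure_ascii=False)}")
        lines.append(f"  source: {json.dumps(entry['source'], ensure_ascii=False)}")
    return "\n".join(lines) + "\n"
```

The result of `to_yaml(merge_links(...))` can be written to a `.yml` file with a normal `open(path, "w", encoding="utf-8")` call.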

Download

See download.

Currently, webpages and PDFs are supported.
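One way a downloader can tell webpages from PDFs is to check the `Content-Type` response header and fall back to the URL's file extension. The sketch below illustrates that idea only. The helper names and the hashed file-naming scheme are hypothetical, and the actual download scripts may work differently.

```python
import hashlib
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def pick_extension(url, content_type=""):
    """Choose '.pdf' or '.html' from the Content-Type header, then the URL path."""
    mime = content_type.split(";")[0].strip().lower()
    if mime == "application/pdf":
        return ".pdf"
    if mime in ("text/html", "application/xhtml+xml"):
        return ".html"
    # No usable header: fall back to the path, ignoring any query string.
    if urlparse(url).path.lower().endswith(".pdf"):
        return ".pdf"
    return ".html"


def target_filename(url, content_type=""):
    """Derive a stable file name from the URL (assumed naming scheme)."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return digest + pick_extension(url, content_type)


def download(url, out_dir="."):
    """Fetch `url` and save it as a webpage or PDF (needs network access)."""
    with urlopen(url) as resp:
        content_type = resp.headers.get("Content-Type", "")
        data = resp.read()
    path = os.path.join(out_dir, target_filename(url, content_type))
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

Hashing the URL avoids file-name collisions and characters that are invalid on some file systems, at the cost of less readable names.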

LICENSE

MIT