Batch download
This tool lets you traverse over an arbitrary number of levels of urls, regex-matching pages and files to download.
Configure base url and regexes in yaml file.
Features
- Traverse hierarchy of pages / urls
- Match links and files to download with regexes
- Named capture groups store variables (to be used for next fetch or in template)
- Use nunjucks templates to generate file names from variables
How to YAML
Tasks contain name, regex and either a tasks array or a file:
- name is the key the variables of the named capture groups will be assigned to.
- regex will be matched to the body of the url specified by the parent's match results. Named capture groups can be used to:
- indicate the url for the next file to process or the file to download and save.
- store variables to be used in template.
- tasks contain further tasks to process on the body of the matched url.
- file is a templated string generating the path where the file will be saved.