This is a template repository for a simple GitHub scraper, a technique pioneered by Simon Willison.
At this point the template only supports very simple fetching and committing of a JSON data file from somewhere on the internet.
Replace https://www.example.com/data.json in the fetch.yaml file with the URL of the data you want to scrape.
Commit and push the repo to GitHub and you're ready to go.
By default the scraper will run once per week, but you can change the cron schedule in the fetch.yaml file.
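For orientation, a weekly schedule in a GitHub Actions workflow usually looks something like the fragment below. This is only a sketch: the actual contents of fetch.yaml may differ, and the cron value shown is an assumption chosen to match a Sunday 6 pm run.

```yaml
on:
  schedule:
    # Fields: minute, hour, day of month, month, day of week.
    # GitHub Actions evaluates cron schedules in UTC.
    # Assumed default: every Sunday at 18:00 UTC.
    - cron: "0 18 * * 0"
```

Changing the value to, for example, "0 6 * * *" would run the scraper every day at 06:00 UTC instead.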
Data is stored in data.json.
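You may need to update the permissions on the new repository to allow workflows to commit to it. On GitHub this setting is under Settings > Actions > General > Workflow permissions, where you can select "Read and write permissions".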
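Besides changing the repository setting, a workflow can also declare the permission it needs itself. The fragment below is a minimal sketch of that approach. It is not taken from this template's fetch.yaml, so treat its placement in your file as an assumption.

```yaml
# Top-level key in the workflow file (assumed addition, not part of the template).
# Grants the workflow's GITHUB_TOKEN write access to repository contents,
# which is what a commit-and-push step needs.
permissions:
  contents: write
```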
sg -> Updated the permissions, but it still fails? Remove the # on line 7 of fetch.yaml.
sg -> Still didn't run? That's because the default cron schedule is Sunday at 6 pm, so nothing happens until then.
sg -> With Instagram this fails: even though I can point to the JSON, curl + jq isn't working for me. Maybe parse the page with Beautiful Soup first and then run jq on the result? Untested.