web-scraper

Scrapes a list of URLs for similarly structured web pages, read from a txt file, and writes the scraped content to a JSON file. The selectors can be modified to suit different pages.

Requirements

Node.js and npm. Install the dependencies:

npm install

Edit urls.txt to match the sites you will be scraping.
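How the scraper itself parses urls.txt is not shown here. As a rough sketch, assuming the file holds one URL per line, the list could be read as below. The helper name parseUrlList and the handling of blank lines are assumptions for illustration, not part of the project.

```javascript
const fs = require('fs');

// Hypothetical helper: split the text of urls.txt into a clean list of URLs.
// Assumes one URL per line; blank lines and surrounding whitespace are ignored.
function parseUrlList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Read urls.txt from the current directory if it exists.
if (fs.existsSync('urls.txt')) {
  const urls = parseUrlList(fs.readFileSync('urls.txt', 'utf8'));
  console.log(`Loaded ${urls.length} URL(s)`);
}
```

Trimming each line keeps stray spaces or Windows line endings from ending up inside a URL.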

In scrapePage(), change the tag, class, or id selector to match where the content is located on the page, so that the scraper doesn't pick up unneeded information:

$('[TAG/CLASS/ID]').filter(function() { ... });

If your web pages have more content to organize, add more selector blocks like this one inside scrapePage().
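As a sketch of what adding more fields might look like, the function below collects several pieces of content from one page, one selector per field. It assumes `$` is a jQuery-style selector function for the loaded page (for example, what Cheerio's `load` returns), and it reads the first match directly instead of using a `.filter()` callback. The name extractFields and the selectors in the usage note are hypothetical; replace them with ones that match your pages.

```javascript
// Hypothetical helper: pull one text value per field from a loaded page.
// `$` is assumed to be a jQuery-style selector function for that page;
// `selectors` maps an output field name to a tag/class/id selector.
function extractFields($, selectors) {
  const record = {};
  for (const [field, selector] of Object.entries(selectors)) {
    // Use the first match only and trim surrounding whitespace.
    record[field] = $(selector).first().text().trim();
  }
  return record;
}
```

Inside scrapePage(), this could be called as extractFields($, { title: 'h1', body: '.content' }), where both selectors are placeholders. Keeping the selectors in one object means a new field is a one-line change.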

Run the scraper:

node index.js

The output is written to output.json.