Web Scraper Plus is a Chrome browser extension built for extracting data from web pages. With this extension you create a plan (a sitemap) that describes how a web site should be traversed and what data should be extracted. Web Scraper Plus then navigates the site according to the sitemap and extracts the data. Scraped data can later be exported as CSV.
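To make the idea of a sitemap more concrete, here is a minimal sketch of what one might look like as a JavaScript object. The field names (`_id`, `startUrl`, `selectors`, `parentSelectors`) follow the sitemap JSON exported by the upstream Web Scraper project as far as I know, and are an assumption for this fork. The URL, the selector ids, and the `childSelectors` helper are hypothetical and exist only for illustration.

```javascript
// Illustrative sitemap: start at one page, find every product block,
// then read the product name inside each block.
// Field names are assumed from upstream Web Scraper exports.
const sitemap = {
  _id: "example-sitemap",
  startUrl: ["https://example.com/products"],
  selectors: [
    {
      id: "product",
      type: "SelectorElement",
      parentSelectors: ["_root"], // "_root" marks the start page
      selector: "div.product",
      multiple: true,
      delay: 0,
    },
    {
      id: "name",
      type: "SelectorText",
      parentSelectors: ["product"], // evaluated inside each product block
      selector: "h2",
      multiple: false,
      regex: "",
      delay: 0,
    },
  ],
};

// Hypothetical helper: list the ids of selectors that run directly
// under a given parent, which is how the traversal tree is expressed.
function childSelectors(map, parentId) {
  return map.selectors
    .filter((s) => s.parentSelectors.includes(parentId))
    .map((s) => s.id);
}

console.log(childSelectors(sitemap, "_root"));
console.log(childSelectors(sitemap, "product"));
```

The selector tree is encoded only through `parentSelectors`, so the same flat list can describe arbitrarily deep traversals.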
Install the extension from the Chrome Web Store.
Documentation for the new features: wiki
This tool is forked from Web-Scraper and adds many features:
- CLI Support: Start scraping from CMD/Terminal
- MySQL Support: Support MySQL database (v5.7+)
- Anti Lazy-Loading: Work around lazy-loaded content on pages
- Data Filter: Support user-defined JS code for data preprocessing and much more
- Distinct: Remove duplicate data before the end of every task
- Custom Columns: Define the columns you want to display. Use this feature together with Data Filter
- Easy Scrape: Create and scrape sitemaps more easily. (Based on https://github.com/aagiss)
- Random Interval: Add a random delay between requests. (Provided by https://github.com/Euphorbium)
- Scrape multiple pages
- Sitemaps and scraped data are stored in the browser's local storage or in CouchDB
- Multiple data selection types
- Extract data from dynamic pages (JavaScript+AJAX)
- Browse scraped data
- Export scraped data as CSV
- Import and export sitemaps
- Depends only on the Chrome browser
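Three of the features above (Data Filter, Distinct, and Random Interval) can be sketched in a few lines of JavaScript. This is only an illustration of the ideas, not the extension's actual code or API: the function names `cleanRow`, `distinctRows`, and `randomDelayMs`, the row shape, and the parameters are all hypothetical.

```javascript
// Data Filter idea: user-defined preprocessing applied to each scraped row.
// Assumes a row is a plain object of strings (hypothetical shape).
function cleanRow(row) {
  const cleaned = {};
  for (const [key, value] of Object.entries(row)) {
    cleaned[key] = typeof value === "string" ? value.trim() : value;
  }
  return cleaned;
}

// Distinct idea: keep only the first occurrence of each row.
// Keys are sorted so field order does not affect the comparison.
function distinctRows(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(
      Object.keys(row).sort().map((k) => [k, row[k]])
    );
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Random Interval idea: a delay in [baseMs, baseMs + jitterMs) milliseconds.
// The random source is injectable so the function can be tested.
function randomDelayMs(baseMs, jitterMs, random = Math.random) {
  return baseMs + Math.floor(random() * jitterMs);
}

const scraped = [
  { name: " Widget ", price: "9.99" },
  { price: "9.99", name: "Widget" },
  { name: "Gadget", price: "19.99" },
];
const rows = distinctRows(scraped.map(cleanRow));
console.log(rows.length);
console.log(randomDelayMs(1000, 500));
```

Cleaning rows before deduplicating matters here: the first two rows differ only in whitespace and field order, so they collapse into one only after `cleanRow` has run.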
Basic documentation and tutorials are available on webscraper.io
Submit bugs and suggest features on github-issues
When submitting a bug, please attach an exported sitemap if possible.
LGPLv3