Scrapeheap

Originally from Dan Devine - A Content Scraping Tool

Instructions

Right now this version of scrapeheap is in its infancy. Don't expect a fancy user interface. Just pop a URL in and expect results. Simple as that.

Latest Updates

See the dump of content as your scraper works
Saves Docx & HTML in separate folders
Adds some helpful text so that if you want to scrape again, you can just go ahead

Local Deployment

Download/Clone the project
Install dependencies by running composer install && npm install
Ensure you put the project in the directory where Valet has been parked
Access the project locally via Valet at http://scrapeheap.test

This assumes you have Valet installed and properly configured for your project. If not, please refer to the Valet documentation for setup instructions.
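
The steps above can be sketched as shell commands. This is only a rough sketch: the `SITES_DIR` path and the clone URL are assumptions, not values taken from the project, so swap in your own. It also assumes `git`, `composer`, `npm`, and Valet are already installed.

```shell
# Assumed values; adjust to your setup.
SITES_DIR="$HOME/Sites"   # hypothetical directory already parked with Valet
REPO_URL="https://github.com/wyne-ybanez/scrapeheap.git"   # assumed clone URL

# Clone the project into the parked directory
cd "$SITES_DIR"
git clone "$REPO_URL" scrapeheap
cd scrapeheap

# Install PHP and JavaScript dependencies
composer install && npm install

# Valet serves each folder in a parked directory as <folder>.test,
# so the folder name "scrapeheap" maps to http://scrapeheap.test
```

If the directory has not been parked yet, running `valet park` inside it registers it with Valet.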

References

We're using RoachPHP here: https://roach-php.dev/docs/introduction
Check out Dan's original project on this: https://github.com/danieldevine/scrapeheap
Here's a useful guide: https://codewithkyrian.com/p/roachphp-mastering-web-scraping-with-php
