A web scraper written in Node.js that uses Puppeteer to scrape the case studies from High Scalability and download them as PDFs (for learning about system design).
Clone the repository, then run npm i to install the dependencies and npm start to launch the scraper. It prints its progress as it runs:
Launched headless browser...
Found saved pdfs already!
Created folder for pdfs...
Opened log file...
Page url: http://highscalability.com/blog/category/example
Saving article: http://highscalability.com/blog/2020/6/15/how-triplelift-built-an-adtech-data-pipeline-processing-bill.html
Saving article: http://highscalability.com/blog/2020/5/14/a-short-on-how-zoom-works.html
Saving article: http://highscalability.com/blog/2019/11/25/egnyte-architecture-lessons-learned-in-building-and-scaling.html
Saving article: http://highscalability.com/blog/2019/4/8/from-bare-metal-to-kubernetes.html
Saving article: http://highscalability.com/blog/2018/8/27/auth0-architecture-running-in-multiple-cloud-providers-and-r.html
Saving article: http://highscalability.com/blog/2018/4/9/give-meaning-to-100-billion-events-a-day-the-analytics-pipel.html
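The "Saving article" step above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names pdfNameFor and saveArticle, the file-naming scheme, the A4 page format, and the networkidle2 wait condition are all assumptions made for the example. Only the Puppeteer calls themselves (launch, newPage, goto, pdf, close) come from Puppeteer's public API.

```javascript
const path = require("path");

// Hypothetical helper: derive a PDF file name from an article URL by
// taking the last path segment and swapping ".html" for ".pdf".
function pdfNameFor(articleUrl) {
  const { pathname } = new URL(articleUrl);
  const base = path.posix.basename(pathname, ".html");
  return `${base}.pdf`;
}

// Hypothetical helper: render a single article into outDir as a PDF.
// Puppeteer is required lazily so pdfNameFor works without it installed.
async function saveArticle(articleUrl, outDir) {
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch(); // headless browser
  try {
    const page = await browser.newPage();
    // Assumed wait condition: treat the page as loaded once the network is mostly idle.
    await page.goto(articleUrl, { waitUntil: "networkidle2" });
    const target = path.join(outDir, pdfNameFor(articleUrl));
    await page.pdf({ path: target, format: "A4" });
    return target;
  } finally {
    // Close the browser even if navigation or rendering fails.
    await browser.close();
  }
}
```

For example, pdfNameFor("http://highscalability.com/blog/2020/5/14/a-short-on-how-zoom-works.html") returns "a-short-on-how-zoom-works.pdf". Launching one browser per article keeps the sketch short; a real run over many articles would more likely reuse a single browser instance.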