This project crawls information-rich web pages such as LinkedIn and Google to aggregate data on potential new investment clients.
The project will be built on:
- Scrapy for the web crawlers (i.e. spiders) and the data pipeline (see the spider sketch below)
- Selenium for rendering JavaScript-heavy, dynamically loaded pages (see the rendering sketch below)
- Docker for running scaled test and deployment environments
- MySQL as the database that stores the aggregated information (see the pipeline sketch below)
The main language will be Python.
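
As a starting point, a Scrapy spider might look like the sketch below. The seed URL, CSS selectors, and field names are all placeholders, not the project's actual targets; real spiders for sites like LinkedIn will need site-specific selectors plus careful rate limiting and terms-of-service review.

```python
# Minimal Scrapy spider sketch. The URL and selectors are
# hypothetical placeholders for illustration only.
import scrapy


class ProspectSpider(scrapy.Spider):
    name = "prospects"
    # Hypothetical seed URL; replace with the real target.
    start_urls = ["https://example.com/directory"]

    def parse(self, response):
        # Yield one item per profile card (selectors are placeholders).
        for card in response.css("div.profile"):
            yield {
                "name": card.css("h2::text").get(),
                "company": card.css(".company::text").get(),
                "url": response.urljoin(card.css("a::attr(href)").get()),
            }
        # Follow pagination if a next-page link is present.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```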
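For pages whose content only appears after JavaScript runs, Selenium can render the page first and hand the resulting HTML to Scrapy's selectors. The sketch below assumes headless Chrome with a chromedriver available on the PATH (e.g. baked into the Docker image); the URL and selector are placeholders.

```python
# Sketch: render a JS-heavy page with Selenium, then parse the
# resulting HTML with Scrapy's selector machinery.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from scrapy.selector import Selector

options = Options()
options.add_argument("--headless")    # run without a display
options.add_argument("--no-sandbox")  # commonly needed in containers

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com/dynamic-page")  # placeholder URL
    html = driver.page_source  # HTML after JavaScript has executed
finally:
    driver.quit()

# Parse the rendered HTML outside the browser.
sel = Selector(text=html)
print(sel.css("h2::text").getall())
```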
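On the storage side, a Scrapy item pipeline can push each scraped item into MySQL. The sketch below assumes the mysql-connector-python driver and a hypothetical `prospects` table; the connection details are placeholders and would normally come from Scrapy settings or environment variables.

```python
# Sketch of a Scrapy item pipeline writing to MySQL. Table name,
# columns, and credentials are hypothetical placeholders.
import mysql.connector


class MySQLPipeline:
    def open_spider(self, spider):
        # Placeholder credentials; load from settings/env in practice.
        self.conn = mysql.connector.connect(
            host="localhost", user="scraper",
            password="secret", database="crawl",
        )
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        # Insert one row per scraped item.
        self.cursor.execute(
            "INSERT INTO prospects (name, company, url) VALUES (%s, %s, %s)",
            (item.get("name"), item.get("company"), item.get("url")),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.cursor.close()
        self.conn.close()
```

The pipeline would be enabled by registering it under `ITEM_PIPELINES` in the project's Scrapy settings.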