This is a web crawler written for my application at Digesto, following their criteria.
It is mainly based on Python and sqlite3, and uses Python libraries such as XPath, pandas, and requests.
Requirements:
- Unix/Linux machine
- Git
- python3
- sqlite3
Clone this repository:
git clone https://github.com/marcksm/web-scraper.git
cd web-scraper
Install pandas for python3:
sudo apt-get install python3-pandas
Install sqlite3:
sudo apt-get install sqlite3
To see the script's available commands, run the following inside the web-scraper folder:
python3 main.py help
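As a rough illustration of how a script can map a first argument such as help, download, show, or delete to an action, here is a minimal sketch. The handler functions and their return strings are hypothetical placeholders, and the real main.py may be organized differently.

```python
import sys

# Hypothetical handlers: each stands in for the real work done by main.py.
def show_help():
    return "commands: help, download, show, delete, cli"

def download():
    return "download requested"

def show():
    return "show requested"

def delete():
    return "delete requested"

# Map each command name to its handler.
COMMANDS = {
    "help": show_help,
    "download": download,
    "show": show,
    "delete": delete,
}

def dispatch(argv):
    """Run the handler named by argv[1]; fall back to help when it is missing or unknown."""
    name = argv[1] if len(argv) > 1 else "help"
    handler = COMMANDS.get(name, show_help)
    return handler()

if __name__ == "__main__":
    print(dispatch(sys.argv))
```

A dictionary lookup keeps the list of commands in one place, so adding a command means adding one entry rather than another branch.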
To download the data and store it in a SQLite file (computers.db):
python3 main.py download
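For readers curious what the storage step might look like, here is a minimal sketch using only Python's standard sqlite3 module. The table name, the column names, and the sample rows are assumptions made for illustration. The project's actual schema is not documented here, and it may write through pandas instead.

```python
import sqlite3

def store_rows(conn, rows):
    """Insert (name, price) tuples into an assumed 'computers' table and return the row count."""
    # Assumed schema: the real table layout may differ.
    conn.execute("CREATE TABLE IF NOT EXISTS computers (name TEXT, price REAL)")
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO computers (name, price) VALUES (?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM computers").fetchone()[0]

# An in-memory database keeps the example side-effect free;
# pass "computers.db" instead to write a file on disk.
conn = sqlite3.connect(":memory:")
count = store_rows(conn, [("Example A", 10.0), ("Example B", 20.0)])  # placeholder data
print(count)
```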
To see the stored data:
python3 main.py show
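Reading the stored data back can be sketched with the standard sqlite3 module as well. The table and column names below are assumptions, and the example builds its own in-memory table with one placeholder row so that it runs without the real computers.db.

```python
import sqlite3

def fetch_rows(conn):
    """Return all (name, price) rows from the assumed 'computers' table, sorted by name."""
    cursor = conn.execute("SELECT name, price FROM computers ORDER BY name")
    return cursor.fetchall()

# Build a throwaway in-memory table; connect to "computers.db" to read real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE computers (name TEXT, price REAL)")
conn.execute("INSERT INTO computers VALUES (?, ?)", ("Example A", 10.0))  # placeholder row

rows = fetch_rows(conn)
for name, price in rows:
    print(f"{name}\t{price}")
```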
To delete the stored data:
python3 main.py delete
For additional commands, use the command-line input mode:
python3 main.py cli