This is a simple Python project that scrapes data from CNN and pushes it to Hadoop HDFS.
- create a virtualenv (optional)
- activate the environment
- run
pip install -r requirements.txt
- run
python scraping_cnn.py args
args is an integer: the number of pages that you want to scrape. For example, python scraping_cnn.py 5 scrapes 5 pages.
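
As an illustration of how a page-count argument like this can be read and validated, here is a minimal sketch using the standard-library argparse module. It is not taken from scraping_cnn.py: the function name parse_page_count and the argument name pages are hypothetical, and the rule that the value must be at least 1 is an assumption.

```python
import argparse


def parse_page_count(argv=None):
    """Return the number of pages to scrape, read from the command line.

    Hypothetical helper: the real scraping_cnn.py may handle its argument differently.
    """
    parser = argparse.ArgumentParser(
        description="Illustrative parser for a page-count argument."
    )
    # type=int makes argparse reject non-numeric input with a usage error.
    parser.add_argument("pages", type=int, help="number of pages to scrape")
    args = parser.parse_args(argv)
    # Assumption: zero or negative page counts are treated as invalid.
    if args.pages < 1:
        parser.error("pages must be a positive integer")
    return args.pages
```

Calling parse_page_count(["5"]) returns 5. With no argument list, the function reads sys.argv, so a script could call parse_page_count() at startup and exit with a usage message when the value is missing or not a number.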