Extracts Nepali text from the web.

Scrapper

Developed as part of the #Icodeformyभाषा project of the Nepali NLP Group.

Dependencies

Beautiful Soup, re, urllib

Install pip (Debian/Ubuntu, for Python 3)
$ sudo apt-get install python3-pip


Installing dependencies

Only Beautiful Soup needs to be installed; re and urllib are part of the Python 3 standard library.

$ pip install beautifulsoup4

or

$ pip install -r requirements.txt

E-Kantipur

Script to scrape Nepali news from various categories of the following publications:

1. कान्तिपुर (Kantipur)
2. नारी (Nari)
3. साप्ताहिक (Saptahik)
4. नेपाल (Nepal)
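
A minimal sketch of how the dependencies listed above could fit together to pull Nepali text out of a page. This is not the code in Scrapper.py: the function names fetch_html and extract_nepali_text are hypothetical, and keeping only runs of characters from the Devanagari Unicode block (U+0900 to U+097F) is an assumption made for illustration.

```python
import re
import urllib.request

from bs4 import BeautifulSoup

# Assumption: "Nepali text" means runs of characters from the Devanagari
# block (U+0900-U+097F), optionally joined by single spaces.
DEVANAGARI_RUN = re.compile(r"[\u0900-\u097F]+(?: [\u0900-\u097F]+)*")


def fetch_html(url):
    """Download a page and return its HTML, assuming UTF-8 encoding."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_nepali_text(html):
    """Return the Devanagari text runs found in the visible text of html."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop script and style elements so their contents are not treated as text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Separate the text of different elements with newlines so that runs
    # from neighbouring elements are not merged.
    text = soup.get_text(separator="\n")
    return [match.group(0) for match in DEVANAGARI_RUN.finditer(text)]
```

Calling extract_nepali_text(fetch_html(url)) would return a list of strings, one per run of Devanagari text. Zero-width joiners and Latin-script text inside an article are not handled by this sketch.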

Running Script

You can run it in your terminal using the following command:

$ python Scrapper.py http://kantipur.ekantipur.com कान्तिपुर

Note that if you also have Python 2.x installed on your machine, you may need to explicitly call Python 3.x by running the command this way:

$ python3 Scrapper.py http://kantipur.ekantipur.com कान्तिपुर
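
The commands above pass two positional arguments: a base URL and a publication name. A minimal sketch of how a script might read such arguments with the standard argparse module follows. The argument names url and publication, the PUBLICATIONS tuple, and the parse_args function are hypothetical and are not taken from Scrapper.py.

```python
import argparse

# Publication names from the list in this README; used here only to
# validate the second argument (an assumption, not the script's behavior).
PUBLICATIONS = ("कान्तिपुर", "नारी", "साप्ताहिक", "नेपाल")


def parse_args(argv=None):
    """Parse a base URL and a publication name from the command line."""
    parser = argparse.ArgumentParser(description="Scrape Nepali news text.")
    parser.add_argument("url", help="base URL of the site to scrape")
    parser.add_argument(
        "publication",
        choices=PUBLICATIONS,
        help="name of the publication, as listed in this README",
    )
    # When argv is None, argparse reads the arguments from sys.argv.
    return parser.parse_args(argv)
```

With this sketch, parse_args(["http://kantipur.ekantipur.com", "कान्तिपुर"]) yields an object whose url and publication attributes hold the two values, and an unknown publication name makes argparse exit with a usage message.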