crawler

Implementation of a small crawler & scraper in Python

Primary language: Python. License: MIT.

This is a school project that covers just the basics of web scraping and crawling, using regular expressions and the requests library.
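As a rough illustration of the regex-plus-requests approach, here is a minimal sketch. It is not the project's actual code: the function names (`fetch`, `extract_title`, `extract_links`) and both patterns are hypothetical, and regular expressions only cope with simple, well-formed HTML.

```python
import re

# Hypothetical patterns, for illustration only.
# TITLE_RE captures the text between <title> and </title>.
TITLE_RE = re.compile(r'<title[^>]*>(.*?)</title>', re.IGNORECASE | re.DOTALL)
# LINK_RE captures the quoted href value of each <a> tag.
LINK_RE = re.compile(r'<a\s[^>]*?href=["\'](.*?)["\']', re.IGNORECASE)


def fetch(url, timeout=10):
    """Download a page and return its HTML as text."""
    # Imported here so the parsing helpers below can be used
    # even where the third-party requests package is not installed.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx / 5xx responses
    return response.text


def extract_title(html):
    """Return the page title, or None if there is no <title> tag."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def extract_links(html):
    """Return all href values found in <a> tags, in document order."""
    return LINK_RE.findall(html)
```

A crawler would typically call `fetch` on a start URL, pass the result to `extract_links`, and repeat for the links it has not visited yet.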

Developed a small crawler which scrapes some data from:

Required libs:

  • textwrap
  • requests
  • re
  • cyrtranslit (Windows: navigate to Python27/Scripts in cmd and type "pip install cyrtranslit"; Mac: open Terminal and type "pip install cyrtranslit")
  • webbrowser

To run:

  • Greenpeace crawler & scraper:
    • run "main.py"
  • RAF scraper (also shows how to encode Cyrillic text using the "cyrtranslit" lib):
    • run "Cyrillic_crawl_dodatak"
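
For the Cyrillic handling mentioned for the RAF scraper, a minimal sketch of transliterating scraped text to Latin script with cyrtranslit could look like this. The helper name `transliterate` is hypothetical, and the Serbian language code `"sr"` is an assumption; the project's own script may do this differently.

```python
def transliterate(text, lang_code="sr"):
    """Convert Cyrillic text to Latin script.

    lang_code "sr" (Serbian) is an assumption made for this sketch.
    """
    # Imported here so this module still loads where cyrtranslit
    # has not been installed yet.
    import cyrtranslit

    return cyrtranslit.to_latin(text, lang_code)


if __name__ == "__main__":
    print(transliterate("Београд"))
```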