Demo repo containing the crawler code for the Python Karachi talk.
It contains Python crawler code snippets for the following items:
- Basic crawling and scraping setup with the "requests" and "parsel" libraries
- Basic crawling setup with asyncio
- Scrapy demo for crawling a website
- A service utility for crawling a library-style website
- Headless bot demo for GitHub
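
As a rough illustration of the asyncio item above, here is a minimal sketch of bounded concurrent fetching. It is not the code from this repo: `crawl`, `fake_fetch`, `run_demo`, and the `example.com` URLs are hypothetical names chosen for the example, and `fake_fetch` stands in for a real HTTP call so the sketch runs offline.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable

# Hypothetical type: any coroutine function that takes a URL and returns the page body.
Fetcher = Callable[[str], Awaitable[str]]


async def crawl(urls: Iterable[str], fetch: Fetcher, max_concurrency: int = 5) -> Dict[str, str]:
    """Fetch every URL concurrently, with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    results: Dict[str, str] = {}

    async def worker(url: str) -> None:
        # The semaphore keeps the crawler from flooding the target site.
        async with semaphore:
            results[url] = await fetch(url)

    await asyncio.gather(*(worker(url) for url in urls))
    return results


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request (assumption: swap in your own client)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html><title>{url}</title></html>"


def run_demo() -> Dict[str, str]:
    """Crawl three placeholder URLs with the fake fetcher."""
    urls = [f"https://example.com/page/{i}" for i in range(3)]
    return asyncio.run(crawl(urls, fake_fetch, max_concurrency=2))
```

Passing the fetcher in as an argument keeps the concurrency logic separate from the HTTP client, so the same `crawl` function can be tried with any async client you prefer.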
Try it out

Want to learn? Fork the repo and play around with the crawlers. The following are some recommended tasks.
- Crawl the Amazon Kindle eBooks store (https://www.amazon.com/Kindle-eBooks) using the book library example, and make it faster with asyncio
- Move around the book store and scrape each book's title, author, and description using Scrapy
- Write a bot for posting to the social networks Facebook and Twitter, and schedule it to scrape a sports website and auto-post to your social profiles (do not store your password in the repo)
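
For the social bot task, one common way to keep passwords out of the repo is to read them from environment variables at run time. The sketch below assumes that approach; `load_credentials` and the `<PREFIX>_USERNAME` / `<PREFIX>_PASSWORD` variable names are hypothetical conventions for this example, not something the repo defines.

```python
import os
from typing import Dict


def load_credentials(prefix: str) -> Dict[str, str]:
    """Read `<PREFIX>_USERNAME` and `<PREFIX>_PASSWORD` from the environment.

    The variable naming scheme is an assumption made for this sketch.
    """
    username = os.environ.get(f"{prefix}_USERNAME")
    password = os.environ.get(f"{prefix}_PASSWORD")
    if not username or not password:
        # Fail early with a clear message instead of posting with empty credentials.
        raise RuntimeError(
            f"Set {prefix}_USERNAME and {prefix}_PASSWORD before running the bot"
        )
    return {"username": username, "password": password}
```

With this in place, a bot script could call `load_credentials("TWITTER")` after the variables are exported in the shell, so nothing secret needs to be committed.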
Feel free to open an issue if you run into difficulties with any of the provided examples.