UKitapScraper

A demo from my talk on web scraping @ Istanbul Bilgi University

This simple Python code demonstrates the process of:

Creating a persistent HTTP session using proxies.
Sending a POST request with a payload to login.
Sending a GET request to fetch the HTML content.
Querying the HTML document with the XPath query language to scrape data.
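
The four steps above can be sketched with the Python standard library alone. This is a minimal sketch and not the code from the talk: the helper names (build_session, login, fetch_html, extract_titles), the sample markup, and the XPath expression are all assumptions made for illustration. ElementTree supports only a subset of XPath and only parses well-formed markup, so real pages usually need a more lenient HTML parser.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from http.cookiejar import CookieJar


def build_session(proxies=None):
    """Return an opener that keeps cookies between requests (a persistent session)."""
    handlers = [urllib.request.HTTPCookieProcessor(CookieJar())]
    if proxies:
        # Hypothetical mapping, e.g. {"http": "http://127.0.0.1:8080"}
        handlers.append(urllib.request.ProxyHandler(proxies))
    return urllib.request.build_opener(*handlers)


def login(session, url, payload):
    """POST the form-encoded payload; cookies set by the server stay in the session."""
    data = urllib.parse.urlencode(payload).encode("utf-8")
    # Passing data makes urllib send a POST request instead of a GET
    with session.open(url, data=data, timeout=10) as response:
        # Assumption: the caller decides what counts as a successful login
        return response.status


def fetch_html(session, url):
    """GET a page through the same session and return its decoded body."""
    with session.open(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_titles(markup):
    """Select the title of every element marked as a book (hypothetical layout)."""
    root = ET.fromstring(markup)
    query = ".//li[@class='book']/span[@class='title']"
    return [node.text for node in root.findall(query)]


# Made-up, well-formed markup standing in for a fetched page
SAMPLE = """<ul>
  <li class="book"><span class="title">Book A</span><span class="price">25</span></li>
  <li class="book"><span class="title">Book B</span><span class="price">30</span></li>
  <li class="ad"><span class="title">Sponsored</span></li>
</ul>"""

print(extract_titles(SAMPLE))  # ['Book A', 'Book B']
```

Only extract_titles runs here, so the sketch works offline. login and fetch_html would be called with real URLs and the opener returned by build_session.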

"My sincere gratitude and appreciation to my teacher, Uzay Çetin, for giving me this opportunity."

Usage Example

Requires an account on: http://ukitap.com/

from ukitap import UKitapScraper
import json

# Create a new scraper object
uks = UKitapScraper()

# Create a session by logging in using your payload
session = uks.login(payload = {
    'eposta': 'your_email',
    'parola': 'your_password'
}, use_proxies = False)

# If your login is successful, fetch the books on sale by price
if session: 
    books = uks.get_books_by_price(session = session, price = 30, num_fetch = 100, use_proxies = False)

    # Print the books in a format which won't hurt your eyes :)
    print(json.dumps(obj = books, indent = 2, ensure_ascii = False))

Other Resources

These are the resources I used during my talk:

XPath Cheatsheet: https://devhints.io/xpath
Chrome Tamper Plugin: https://goo.gl/Ncrtah

Changelog

0.0.1
- Initial release

ozturkberkay/UKitapScraper
