/Auto-Task

Task Automation w/ Python


Crawling

What is web scraping? (used here interchangeably with crawling)

Web scraping is a software technique for extracting information from websites.

How the program is constructed matters, because the approach depends on where the target data lives.

Where does crawling take place?
  1. HTML
  2. JSON
  3. JavaScript
  4. Authorization....
  5. Web front-end frameworks
  6. And so on
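
As a rough illustration of the first two targets in the list, the sketch below pulls data out of an HTML string and a JSON string using only the Python standard library. The sample documents, the `LinkCollector` class, and the `items`/`title` keys are hypothetical, made up for this example; a real site will have its own structure.

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_from_html(html_text):
    """Return all link targets found in an HTML document."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def titles_from_json(json_text):
    """Return the titles from a JSON payload shaped like SAMPLE_JSON."""
    data = json.loads(json_text)
    return [item["title"] for item in data["items"]]


# Made-up sample inputs; no network access is needed.
SAMPLE_HTML = '<html><body><a href="/a">A</a><p>text</p><a href="/b">B</a></body></html>'
SAMPLE_JSON = '{"items": [{"title": "first"}, {"title": "second"}]}'

if __name__ == "__main__":
    print(links_from_html(SAMPLE_HTML))
    print(titles_from_json(SAMPLE_JSON))
```

HTML needs a parser because the data is mixed with markup, while JSON maps directly onto Python dictionaries and lists, which is why JSON endpoints are usually the easier target when a site offers them.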

HTTP(S) Clients (HTTP: Hypertext Transfer Protocol)

We start the project with a web-client architecture. To customize, use a socket server architecture.

  • GUI browsers: Chrome, Firefox, ...

  • CLI (text-mode) browsers: w3m, elinks

  • CLI HTTP tools: curl, wget

  • Python libraries/frameworks: requests, selenium, Scrapy
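
To make the client side concrete, here is a minimal sketch of an HTTP client written with the standard library's `urllib.request`. So that it runs without internet access, it fetches from a throwaway local server started in the same script. The `HelloHandler` class, the `fetch` and `demo` functions, and the `auto-task-sketch/0.1` User-Agent string are assumptions for illustration, not part of this repository.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny stand-in for a website: answers every GET with one HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>hello</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(url, timeout=5.0):
    """GET a URL and return (status code, decoded body)."""
    # Ignore any system proxy settings so the loopback demo is not rerouted.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(
        url, headers={"User-Agent": "auto-task-sketch/0.1"}
    )
    with opener.open(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def demo():
    """Start the local server on a free port, fetch one page, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch(f"http://127.0.0.1:{port}/")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status, body)
```

With the third-party `requests` library from the list above, the fetch step would typically shrink to a call such as `requests.get(url, timeout=5)`; the standard library is used here only so the sketch has no extra dependencies.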