This is the textbook for HKBU's COMM7780/JOUR7280 courses. The course COMM7780/JOUR7280, "Big Data for Media and Communication", is set up for master's students in the School of Communication, Hong Kong Baptist University. Its purpose is to motivate students to become T-shaped talents in the communication field. The course involves intensive Python training and a focus on solving practical problems. This GitBook collects all the materials related to the lab exercises, covering basic Python, data scraping, table manipulation and data mining. Every week, one group of students applies that week's knowledge to a real problem from their own domain. The solution is posted on Data and News Society.
- Notes: Week 00 - GitHub and markdown
- Notes: Week 01 - Kickoff: Terminal, shell and "hello world"
- Notes: Week 02 - Python as a powerful calculator: basics and arithmetic
- Notes: Week 03 - Python for everything: Data structure, control flow and code reuse
- Notes: Week 04 - Get structured data: CSV, JSON and API
- Notes: Week 05 - Get semi-structured data: Web scraping
- Notes: Week 06 - Advanced scraping: browser emulation, anti-crawler and other nitty gritties
- Notes: Week 07 - Work with tables: data cleaning and pre-processing
- Notes: Week 08 - Work with tables: 1D analysis and 2D analysis
- Notes: Week 09 - Present findings: data visualization and reproducible report
- Notes: Week 10 - Handle special data types: text, graph, time series, geographical
- Notes: Week 11 - Machine learning primer: clustering, classification, regression
- Course Admin
- Set up Python Environment on Windows and Mac
- Shell
- Python Language Basics
- Python 2 vs. Python 3
- Dataprep
- Pro Tips
- Resources
- Guide for contributors
- GitHub
- Other Frequently Asked Questions
- module: geopy
- module: requests
- module: csv
- module: BeautifulSoup
- module: jupyter
- module: pandas
- module: seaborn
- module: matplotlib
- module: lxml
- module: python-twitter
- module: datetime
- module: selenium
CC-BY-NC-ND