Data Manipulation and Analysis

This part covers data harvesting, processing, aggregation, and analysis in Python, using Jupyter notebooks.

Introduction

Data analysis is crucial to evaluating and designing solutions and applications, as well as to understanding users' information needs and usage. In many cases, the data we need is distributed online across many web pages, stored in a database, or available in a large text file. Often these data (e.g., web server logs) are too large to obtain or process manually.
We need an automated way of gathering data, parsing it, and summarizing it before more advanced analysis.
Topics include techniques of exploratory data analysis: scripting, text parsing, Structured Query Language (SQL), regular expressions, graphing, and clustering methods for exploring data.
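
As a small illustration of the gather, parse, and summarize workflow described above, the following sketch parses a few web server log lines with a regular expression and counts requests by status code and by host. The sample lines, the assumed Common Log Format layout, and the names `SAMPLE_LOG`, `LOG_PATTERN`, `parse_log`, and `summarize` are all hypothetical and chosen only for this example; real logs may use a different format and would normally be read from a file.

```python
import re
from collections import Counter

# Hypothetical sample data, assumed to follow the Common Log Format.
# In practice these lines would be read from a (possibly very large) file.
SAMPLE_LOG = """\
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1500
192.0.2.1 - - [10/Oct/2023:13:57:12 +0000] "GET /missing HTTP/1.1" 404 512
this line is malformed and should be skipped
"""

# Named groups make the parsed fields easy to refer to later.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log(text):
    """Return one dict per line that matches the pattern; skip other lines."""
    records = []
    for line in text.splitlines():
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


def summarize(records):
    """Aggregate the parsed records into simple counts."""
    return {
        "requests": len(records),
        "by_status": Counter(r["status"] for r in records),
        "by_host": Counter(r["host"] for r in records),
    }


records = parse_log(SAMPLE_LOG)
summary = summarize(records)
print("Total requests:", summary["requests"])
print("Requests by status:", dict(summary["by_status"]))
print("Requests by host:", dict(summary["by_host"]))
```

Skipping lines that do not match, rather than raising an error, is one possible design choice for messy real-world data; a more careful script might also count or log the skipped lines so that parsing problems do not go unnoticed.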