/pythondata

mk347 pandas data analysis

Data Analysis with Python (Pandas)

Need to add:

* More work with file I/O, especially reading and writing Excel files from pandas
* More database work
* Accessing APIs from Python, and the JSON format
* Data visualization in pandas: matplotlib (lite) and seaborn, maybe Bokeh or Altair?
* Is there anything (or enough) on data cleaning, such as nulls and missing data?
* Should we add a simple ML model at the end, such as linear regression?

Week1:

Intro and review of the Anaconda installation, using Python 3. Review of CSV handling, dicts, and iterating over file rows. If you want to use CSVReader, that's OK; just add or remove material as needed. Also, introduce tuples in a separate notebook.
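
As a rough illustration of the Week 1 review topics, the sketch below reads CSV rows into dicts, counts values with a plain dict, and unpacks tuples. The sample data and column names are made up for the example, and it uses the standard library's `csv.DictReader`, which may differ from the reader used in the notebooks.

```python
import csv
import io

# Hypothetical CSV content; in class this would be a file opened with open(...).
raw = "name,city,age\nAna,Boston,34\nBen,Chicago,28\nCara,Boston,41\n"

# DictReader yields one dict per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw)))

# Count rows per city with a plain dict while iterating over the rows.
city_counts = {}
for row in rows:
    city_counts[row["city"]] = city_counts.get(row["city"], 0) + 1

# Tuples are immutable sequences; here each row becomes a (name, age) pair.
name_age = [(row["name"], int(row["age"])) for row in rows]

# Tuple unpacking assigns both parts of the pair in one statement.
oldest_name, oldest_age = max(name_age, key=lambda pair: pair[1])

print(city_counts)
print(oldest_name, oldest_age)
```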

Week2:

Selecting and counting in pandas. Column operations.
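
A minimal sketch of what selecting, counting, and column operations might look like in pandas. The DataFrame and its column names are invented for illustration and are not taken from the course data.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "city": ["Boston", "Chicago", "Boston", "Denver"],
    "price": [10.0, 20.0, 30.0, 40.0],
    "qty": [1, 2, 3, 4],
})

# Selecting: filter rows with a boolean mask and pick columns by label.
boston = df.loc[df["city"] == "Boston", ["name", "price"]]

# Counting: how many rows carry each distinct value in a column.
city_counts = df["city"].value_counts()

# Column operation: arithmetic between columns is applied element-wise.
df["total"] = df["price"] * df["qty"]

print(boston)
print(city_counts)
print(df)
```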

Week3:

Group by, aggregate. String operations in pandas.
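
The following sketch shows one way string operations and group-by aggregation can fit together: messy labels are cleaned with the `.str` accessor first, then grouped. The data and column names are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical data with inconsistent whitespace and casing in the labels.
df = pd.DataFrame({
    "city": [" boston", "Chicago ", "BOSTON", "chicago"],
    "sales": [100, 200, 300, 400],
})

# String operations: trim whitespace, then normalize the casing.
df["city"] = df["city"].str.strip().str.title()

# Group by the cleaned label and compute several aggregates at once.
summary = df.groupby("city")["sales"].agg(["sum", "mean"])

print(summary)
```

Cleaning before grouping matters here: without it, each of the four spellings would form its own group.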

Week4:

Pivot and unpivot. Could use an Excel file that needs unpivoting as a case study? Also time series: can that be expanded?
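
As a sketch of the unpivot case, the example below builds a small "wide" table of the kind often found in spreadsheets (one column per year), unpivots it with `melt`, and pivots it back. The table is hypothetical and is built in memory rather than read from an Excel file.

```python
import pandas as pd

# Hypothetical wide table: one row per region, one column per year.
wide = pd.DataFrame({
    "region": ["East", "West"],
    "2021": [10, 30],
    "2022": [20, 40],
})

# Unpivot: the year columns become rows in a (region, year, sales) layout.
long = wide.melt(id_vars="region", var_name="year", value_name="sales")

# Pivot: rebuild the wide layout from the long one.
back = long.pivot(index="region", columns="year", values="sales")

print(long)
print(back)
```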

Week5:

SQLite; joining and merging tables. Requires some basic SQL.
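
To make the join/merge comparison concrete, here is a minimal sketch that performs the same join twice: once in SQL against an in-memory SQLite database, and once in pandas with `merge`. The table names, columns, and rows are invented for the example.

```python
import sqlite3

import pandas as pd

# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (1, 1, 25.0), (2, 1, 15.0), (3, 2, 40.0);
""")

# Option 1: let SQL do the join and load the result into a DataFrame.
sql_joined = pd.read_sql_query(
    """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
    """,
    conn,
)

# Option 2: load each table separately and merge in pandas.
customers = pd.read_sql_query("SELECT * FROM customers", conn)
orders = pd.read_sql_query("SELECT * FROM orders", conn)
merged = orders.merge(
    customers,
    left_on="customer_id",
    right_on="id",
    suffixes=("_order", "_customer"),  # both tables have an "id" column
)

conn.close()

print(sql_joined)
print(merged[["name", "amount"]])
```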

Week6:

Connecting to remote MySQL databases. I have a lot of data and dbs in one. Public data APIs (can do Twitter too). JSON parsing (need a notebook on that).
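
For the JSON parsing part, a sketch along these lines might serve as a starting point. It parses a hard-coded string shaped like a typical API response instead of calling a real API, so the payload and its field names are purely hypothetical.

```python
import json

import pandas as pd

# Hypothetical API response body; a real one would come from an HTTP request.
payload = """
{
  "results": [
    {"id": 1, "user": {"name": "Ana", "city": "Boston"}},
    {"id": 2, "user": {"name": "Ben", "city": "Chicago"}}
  ]
}
"""

# json.loads turns the text into nested Python dicts and lists.
data = json.loads(payload)
first_name = data["results"][0]["user"]["name"]

# json_normalize flattens nested records into dotted column names.
df = pd.json_normalize(data["results"])

print(first_name)
print(df)
```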

Week7:

Data charts and graphs: matplotlib, seaborn, Bokeh?
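
A minimal sketch of a chart drawn from a DataFrame with pandas' matplotlib-backed `plot` method. The data is made up, and the `Agg` backend is selected only so the example also runs outside a notebook; inside Jupyter that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data to chart.
df = pd.DataFrame({
    "city": ["Boston", "Chicago", "Denver"],
    "sales": [400, 600, 250],
})

# Create the axes explicitly so labels and titles can be adjusted afterwards.
fig, ax = plt.subplots()
df.plot(kind="bar", x="city", y="sales", ax=ax, legend=False)

ax.set_xlabel("City")
ax.set_ylabel("Sales")
ax.set_title("Sales by city")
fig.tight_layout()
```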

Week8:

Overflow, review of harder things, project setup