Software Engineering and Reproducible Research for Data Science
Python notebooks are great: they let us explore data and communicate and share results. But if you want to write more general-purpose, reusable code, you should put it in a package, like numpy, pandas, and the other great tools we depend on.
A full production-grade library is a large undertaking, but this week we will build our own modest but still useful package with utility functions for common data science tasks. Behold, lambdata!
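To make "utility functions for common data science tasks" concrete, here is a minimal sketch of the kind of helper such a package might contain. The function name `train_val_test_split`, its parameters, and the idea of placing it in a module such as `lambdata/helpers.py` are all assumptions for illustration, not part of any assignment. It uses only the standard library so it runs without extra dependencies.

```python
"""Hypothetical helper module for a small utility package (names are illustrative)."""
import random


def train_val_test_split(rows, val_frac=0.1, test_frac=0.1, seed=None):
    """Shuffle rows and split them into train, validation, and test lists.

    rows: any iterable of records (it is copied, not modified in place).
    val_frac, test_frac: fractions of the data for validation and test.
    seed: optional seed so the shuffle is reproducible.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError(
            "val_frac and test_frac must be non-negative and sum to less than 1"
        )

    shuffled = list(rows)
    # A private Random instance avoids touching the global random state.
    random.Random(seed).shuffle(shuffled)

    n_val = round(len(shuffled) * val_frac)
    n_test = round(len(shuffled) * test_frac)
    n_train = len(shuffled) - n_val - n_test

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = train_val_test_split(range(100), 0.1, 0.2, seed=42)
    print(len(train), len(val), len(test))
```

Once a function like this lives in an installable package rather than a notebook cell, it can be imported from any project instead of being copied and pasted between notebooks.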
See each module for specific objectives and assignments. Note that you will be making the lambdata repo yourself - it will not be a fork, and you can have more independence and "creative control" in where you take it. You should still fork and open a PR to this repo, and edit this file to link to your lambdata.
My lambdata repository: https://github.com/Nolanole/lambdata