A curated list (work in progress) of suggested reading materials relating to data science.
There are no required readings for this course. If you would like to read more about data science topics, we recommend the following texts as supplements to the main elements of the course:
- Grus J (2019, 2nd ed) Data Science from Scratch. This book takes you into the HOWs and WHYs, rather than just teaching you to use a library you don't really understand. This is the harder book, but you will grow tremendously working through it. It can be accessed for free through your UCSD login.
- VanderPlas J (2023, 2nd ed) Python Data Science Handbook. Short and to the point. Both the text and the code are freely available on GitHub. Learn to use standard libraries to get things done.
Some older Tutorials for this class exist if you'd like to see them.
- 50 Years of Data Science, D Donoho
- Exploratory Data Analysis, JW Tukey
- Depth of Learning, M Buchanan
- Points of View: Storytelling, M Krzywinski & A Cairo
- Data Organization in Spreadsheets, K Broman & K Woo
- Good Enough Practices in Scientific Computing, G Wilson et al.
- Programming Principles
- Software development skills for data scientists, Trey Causey (2015)
- Better Explained
  - A good resource for accessible math explainers
- Simply Statistics
  - Covers statistics, data science, and AI
- What's the Big Data
  - Curates a lot of quick, interesting things on data science in general, largely industry focused
- Probability and Statistics for Data Science
  - "These notes were developed for the course Probability and Statistics for Data Science at the Center for Data Science in NYU."
- Datawrapper
  - A data-visualization blog chock full of great knowledge and examples
- Data Science from Scratch, Joel Grus
- Python Data Science Handbook, Jake VanderPlas
- Harvard Data Science Review
- Distill
- GigaScience
- Journal of Open Source Software
- Scientific Data