Workshop at General Assembly (Washington, DC) on November 23, 2014.
Instructors: Kevin Markham and Sinan Ozdemir
- Why Python for data science?
- Python basics
- Getting data
- pandas for data exploration and visualization
    - Alcohol consumption data: FiveThirtyEight article, their GitHub repository, modified data
    - Code
    - split-apply-combine pattern
- scikit-learn for machine learning
- Recommended resources for self-learning
    - Basic Python: Codecademy, Google's Python Class, Python Tutor (to visualize code execution)
    - pandas: tutorial, "Python for Data Analysis" (book, includes NumPy and Python reference)
    - Web scraping: tutorial
    - Command line: tutorial
    - Git and GitHub: video series
    - Machine learning: "An Introduction to Statistical Learning" (book and videos), scikit-learn tutorials, Data Science as Competitive Sport (video), Kaggle Titanic competition
    - Types of data scientists: "Analyzing the Analyzers" (ebook written by the founders of Data Community DC)
    - Full-fledged courses: Data Science Specialization (9 short courses from Johns Hopkins University, taught in R), Machine Learning (1 course by Andrew Ng, taught in MATLAB/Octave), Learning from Data (1 course, programming language not specified)
- General Assembly's Data Science course
- Ask us anything!