Repository for the Data Science learning track to host assignments.
Find PowerPoints and helpful resources here.
Find a great quick Python reference here: https://www.w3schools.com/python/
Think ‘process’ not ‘product’. The goal is to learn. The goal is not to hand in a perfect assignment.
Skim your homework assignment BEFORE you do the readings. It will help focus your attention!
SQR3: Scan, Question, Read, Recall, Review!!!!
- Finish any installs not completed in class.
- Skim the Survival Guide presentation. We will discuss this in more detail throughout the first 8 weeks.
- Create a complete schedule for yourself and include EVERYTHING you can (work, commute, dinner, any obligations, and study time). Highlight times you will be studying.
- Head over to this spreadsheet and see the example schedule on the first tab.
- Create a tab with your initials.
- Copy the template sheet into your tab.
- Fill out an accurate schedule for yourself.
- This does not need to be updated as your schedule changes.
- Go through the provided `python_click_through.ipynb`.
  - Open another notebook, copy each cell, and play with it in the new notebook.
  - Ask yourself a question and experiment.
    - What if I change this variable?
      - What’s the outcome?
    - What if I intentionally write code I think will fail?
      - Does it fail?
    - What if I combine the concept in the cell above with this cell?
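As a minimal sketch of this kind of experimenting, here is one practice cell that walks through the three questions. The variable names and values are made up for illustration and are not taken from `python_click_through.ipynb`.

```python
# Hypothetical practice cell: none of these names come from the course notebook.

# What if I change this variable?
price = 4
quantity = 3
total = price * quantity
print("total with price 4:", total)

price = 10  # change one input and rerun the same calculation
total = price * quantity
print("total with price 10:", total)

# What if I intentionally write code I think will fail?
try:
    result = "3" + 4  # adding a string to an integer
except TypeError as error:
    result = None
    print("It failed, as predicted:", error)

# What if I combine the concept above with this cell?
# Convert the string to an integer first, then the addition works.
combined = int("3") + 4
print("combined:", combined)
```

Predicting the outcome before you run a cell, and then checking whether you were right, is the point of the exercise.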
- Complete the week 1 homework notebook found here. You can also find it in the week 1 folder inside the course materials folder here on the GitHub page.
https://www.youtube.com/watch?v=YYXdXT2l-Gg&list=PL-osiE80TeTskrapNbzXhwoFUiLCjGgY7
- Suggested: only videos 2-7, 9, and 10
Optional Reading: Only do this if you have completed your homework, deleted it, and done it again.
http://swcarpentry.github.io/shell-novice/01-intro/index.html
http://swcarpentry.github.io/shell-novice/02-filedir/index.html
- Finish any installation corrections.
- Clone the class repo. (If you have not done so in class)
- OPTIONAL (but preferred): Create your own week 02 repository. (If you have not done so in class)
- Skim the 'In a nutshell' links for 'Learning how to Learn' and 'Deep Work' in the Survival Guide.
  - Find something in those readings that interests you and explore further.
  - These topics can have profound effects outside the classroom as well.
- There are no 'in class assignment' deliverables for this week.
- Loops
  - In DataCamp, go to Intermediate Python, Chapter 4: Loops (Click here to start)
  - Complete "While Loop" through "Loop over list of lists"
- Functions
  - Read this introduction to functions. (Don't worry about the exercises)
  - OPTIONAL: Read more on functions here.
- Classes
  - Working with classes can be challenging. Focus your attention on:
    - Creating classes.
    - Adding attributes.
    - Creating class methods. (methods that operate on the entire class)
    - Creating instance methods. (methods that act only on the instance)
    - Creating objects from classes. (`foo = MyClass(attr1, attr2)`)
  - Focus less on (but be aware of):
    - Inheritance
  - Read this introduction to classes. (Don't worry about the exercises or any notes about Python 2.7.)
  - Read this and complete the exercise at the end.
  - Read this: Python's Methods Demystified
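The focus points for classes can be seen together in one small sketch. The `Student` class, its attributes, and its method names are hypothetical and chosen only for illustration; they do not come from the readings or the homework.

```python
class Student:
    """Hypothetical example class; not taken from the course materials."""

    # Class attribute: shared by every instance of the class.
    enrolled = 0

    def __init__(self, name, track):
        # Instance attributes: each object gets its own values.
        self.name = name
        self.track = track
        Student.enrolled += 1

    def describe(self):
        # Instance method: works with one specific object through `self`.
        return f"{self.name} is studying {self.track}"

    @classmethod
    def total_enrolled(cls):
        # Class method: works with the class itself through `cls`.
        return cls.enrolled


# Creating objects from the class.
ada = Student("Ada", "Data Science")
grace = Student("Grace", "Data Science")

print(ada.describe())
print(Student.total_enrolled())
```

Note that `describe` needs an object to be called on, while `total_enrolled` can be called on the class directly.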
- Complete the `week_02_homework.ipynb` found here. You can also find it in the week 2 folder in the course materials folder at the top of the GitHub page. Submit a link to your repo or submit the `.ipynb` file.
https://www.youtube.com/watch?v=YYXdXT2l-Gg&list=PL-osiE80TeTskrapNbzXhwoFUiLCjGgY7
- Only videos 7 & 8
https://www.youtube.com/watch?v=ZDa-Z5JzLYM&list=PL-osiE80TeTsqhIuOqKhwlXsIBIdSeYtc
- Only videos 1, 2 & 3
Optional Reading: Only do this if you have completed your homework, deleted it, and done it again.
Classes are such a challenging subject that there are no Optional Readings for this week. If you made it this far, please practice classes some more!
- Finish any installations.
- Clone the class repo. (If you have not done so in class)
- OPTIONAL: DataCamp PIP Tutorial
- Readings (The Unix Shell)
- Read these, but spend most of your time this week on NumPy.
- Introducing the Shell
- Navigating Files and Directories
- Working with Files and Directories
- Pipes and Filters
- There are no 'in class assignment' deliverables for this week.
- DataCamp: NumPy
  - Go to DataCamp's Intro to Python Course, Chapter 4: NumPy (chapter starts here)
  - Complete the whole chapter: “NumPy” through “Blend it all together”
- Cheat Sheets (Optional, just for your reference)
- Complete the `week_03_homework.ipynb` here. You can also find the notebook and the CSV file you'll need in the week 4 folder in the course materials folder at the top of the page.
None.
- There are no 'in class assignment' deliverables for this week.
- Start homework early!
- Pandas DataFrames
- Time Series tutorial with Pandas
- Read, Click Through and Digest: `pandas_part_1.ipynb`
- Read, Click Through and Digest: `pandas_part_2.ipynb`
- Upload LATEST version of any missing homework in Canvas.
- Complete the `HeroesOfPymoli_starter.ipynb`. You can find it in the week 4 folder in the course materials folder at the top of the GitHub page. Submit a link to your repo.
- See `hw_instructuions.md` for full homework instructions. NOTE: Best viewed in GitHub.
- `HeroesOfPymoli_output_examples.ipynb` is provided as a reference.
- View in Jupyter or GitHub. (GitHub sometimes mis-formats documents.)
- NOTE: Your numerical results should be very close to the examples.
- Your formatting may be very different than provided examples. Focus on getting the data and less on the formatting.
Pandas is challenging and a core tool for data scientists. If you've completed the homework, explore other Pandas features!
- Finish the in-class `weather_api.ipynb` and submit a link to your repo/notebook. (You can use the same repo for your Homework Assignment.)
- Read Anatomy of a URL
- Read REST API Tutorial
- Start homework early!
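To make the URL reading concrete, here is a short sketch that splits a URL into its parts with Python's standard library. The URL below is made up for illustration and is not an endpoint used in `weather_api.ipynb`.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL for illustration only; it is not a real course endpoint.
url = "https://api.example.com:8443/data/weather?q=Chicago&units=metric#today"

parts = urlsplit(url)
print("scheme:  ", parts.scheme)    # protocol, e.g. https
print("host:    ", parts.hostname)  # server name
print("port:    ", parts.port)      # optional port number
print("path:    ", parts.path)      # resource on the server
print("query:   ", parts.query)     # raw key=value pairs after the "?"
print("fragment:", parts.fragment)  # part after the "#"

# The query string can be turned into a dictionary of lists.
params = parse_qs(parts.query)
print(params)
```

When calling a REST API, the query parameters are usually the part you change from request to request, while the scheme, host, and path stay fixed.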
- In DataCamp, complete all of "Introduction to Data Visualization with Matplotlib 1: Introduction to Matplotlib" and "Introduction to Data Visualization with Matplotlib 3: Quantitative comparisons and statistical visualizations"
  - Introduction to Data Visualization with Matplotlib 1: Introduction to Matplotlib: Start here
    - Do "Introduction to data visualization with Matplotlib" through "Small multiples with shared y axis"
  - Introduction to Data Visualization with Matplotlib 3: Quantitative comparisons and statistical visualizations: Start here
    - Do "Quantitative comparisons: bar-charts" through "Encoding time by color"
  - Note: we are assigning sections 1 and 3 of the Introduction to Data Visualization with Matplotlib module. Feel free to complete modules 2 and 4 if you are interested and have the time, but they are optional.
- In DataCamp, complete all of "Introduction to Data Visualization with Seaborn 1: Introduction to Seaborn" and "Introduction to Data Visualization with Seaborn 4: Customizing Seaborn Plots"
  - Introduction to Data Visualization with Seaborn 1: Introduction to Seaborn: Start here
    - Do "Introduction to Seaborn" through "Hue and count plots"
  - Introduction to Data Visualization with Seaborn 4: Customizing Seaborn Plots: Start here
    - Do "Changing plot style and color" through "Well done! What's next?"
  - Note: we are assigning sections 1 and 4 of the Introduction to Data Visualization with Seaborn module. Feel free to complete modules 2 and 3 if you are interested and have the time, but they are optional.
- Complete the `WeatherPy_homework_starter.ipynb`. You can find it in the week 5 folder in the course materials folder at the top of the GitHub page. Submit a link to your repo.
- This homework is likely your first opportunity to build your portfolio.
- Start early, make it neat.
- This is a real project you can showcase!
- Modules 2 and 4 of the Introduction to Data Visualization with Matplotlib and modules 2 and 3 of Introduction to Data Visualization with Seaborn
- Matplotlib Cheatsheet
- Seaborn Cheatsheet
- For those of you interested in more in-depth material about neural networks, we highly recommend completing the Deep Learning Specialization, a 5-course series from Coursera that covers implementing a set of state-of-the-art neural networks. This is well beyond the scope of CoderGirl - Data Science, but we wanted to keep it here as a reference.
- Perform Exploratory Data Analysis (EDA) on the Heart Disease Kaggle Project
- Post the link to your GitHub repo for Mini-Project part I: EDA
- Your notebook should address each of the following:
- Data issues: missing values, duplicate values, outliers
- Data cleaning solutions: imputation/estimation, dropping entries -- justify your choices!
- Describe the relationship of features to your target (should include at least a few plots).
- Feature engineering (transformation, normalization, creating new combinations of features, etc.), if you think this is necessary. Describe your rationale.
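The EDA checks listed above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the tiny table is made up, the column names (`age`, `chol`, `target`) are placeholders rather than the real dataset's schema, and the 1.5 × IQR rule and median imputation are just one common choice each. Your own notebook should justify whatever you choose.

```python
import pandas as pd

# Tiny made-up table standing in for the real data; the column names
# ("age", "chol", "target") are assumptions for illustration only.
df = pd.DataFrame(
    {
        "age": [63, 37, 41, 41, 56, None],
        "chol": [233, 250, 204, 204, 900, 192],
        "target": [1, 1, 0, 0, 1, 0],
    }
)

# Data issues: missing values and duplicate rows.
missing_per_column = df.isna().sum()
duplicate_rows = df.duplicated().sum()
print(missing_per_column)
print("duplicate rows:", duplicate_rows)

# Data issues: flag outliers with the 1.5 * IQR rule (one common choice).
q1, q3 = df["chol"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["chol"] < q1 - 1.5 * iqr) | (df["chol"] > q3 + 1.5 * iqr)
print(df[is_outlier])

# Cleaning: drop duplicate rows, then impute missing ages with the median.
cleaned = df.drop_duplicates().copy()
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

# Relationship of a feature to the target: compare group averages.
mean_chol_by_target = cleaned.groupby("target")["chol"].mean()
print(mean_chol_by_target)
```

Group averages are only a starting point; the assignment also asks for plots of features against the target.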
- Post the link to your GitHub repo for Mini-Project part II: Modeling
- Your modeling notebook should include each of the following:
- (Feature engineering, if not captured in the EDA notebook. Sometimes it is easier or makes more sense to do feature engineering in the same notebook as your model.)
- Splitting data into train/test sets
- Build (at least one) model
- Predict test set using model(s)
- A quantitative metric of model(s) performance
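The modeling steps above can be sketched with scikit-learn as follows. This is a minimal outline under stated assumptions, not a required approach: the data is synthetic (generated by `make_classification`), and logistic regression and accuracy are stand-ins for whichever model and metric you decide fit the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; replace with the cleaned features and target
# from your own EDA notebook.
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Splitting data into train/test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Build (at least one) model.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict the test set using the model.
y_pred = model.predict(X_test)

# A quantitative metric of model performance.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Fixing `random_state` keeps the split reproducible, which makes it easier to compare models fairly on the same test set.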