Python Notes | Masterschool Exercises
Tools & Skills Used
About This Repo
Intro to Python
Python for DA
This is my personal Python learning journal from the Masterschool Data Analytics program. It includes hands-on exercises, assessments, and practice notebooks organized by sprint. Each notebook reflects a different stage in my learning journey.
Gain a solid understanding of Python basics, such as:
Data Types: Strings, integers, floats, and booleans.
Variables: Storing and manipulating data.
Arithmetic Operators: Performing calculations.
Conditions: Using if, elif, and else statements to make decisions.
Functions: Writing reusable blocks of code.
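As a quick illustration of the basics listed above, here is a minimal sketch. The variable names, values, and the `describe_age` function are hypothetical and are not taken from the course notebooks.

```python
# Hypothetical example values, chosen only to show one of each basic data type.
name = "Ada"         # str
age = 36             # int
height_m = 1.68      # float
is_student = True    # bool


def describe_age(years):
    """Return a coarse age group using if / elif / else."""
    if years < 18:
        return "minor"
    elif years < 65:
        return "adult"
    else:
        return "senior"


# Arithmetic operators: * (multiply), // (floor division), % (remainder)
age_in_months = age * 12
decades = age // 10
leftover_years = age % 10

print(f"{name} is in the '{describe_age(age)}' group ({age_in_months} months old).")
```

Wrapping the condition in a function means the same decision logic can be reused for any age value instead of being rewritten each time.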
Sprint 2: Intermediate Python
Learn how to handle real-world data challenges by diving deeper into Python’s powerful features:
Strings: Manipulating and formatting text.
Booleans: Working with True and False values.
Lists: Storing and managing collections of data.
Loops: Automating repetitive tasks with for and while loops.
Dictionaries: Storing data in key-value pairs.
Tuples and Sets: Working with immutable sequences (tuples) and collections of unique items (sets).
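The collection types and loops above can be sketched together in one small example. The sample data and variable names are made up for illustration and do not come from the notebooks.

```python
# Hypothetical messy input, for illustration only.
raw_cities = ["  berlin", "Paris ", "berlin", "ROME"]

# Strings + lists + for loop: strip whitespace and normalize capitalization.
cities = []
for city in raw_cities:
    cities.append(city.strip().title())

# Sets keep only unique values.
unique_cities = set(cities)

# Dictionaries store key-value pairs; here, city -> number of occurrences.
counts = {}
for city in cities:
    counts[city] = counts.get(city, 0) + 1

# Tuples are immutable: good for fixed records such as coordinates.
berlin_coords = (52.52, 13.40)

# while loop: keep adding 1, 2, 3, ... until the running total reaches 10.
n = 0
total = 0
while total < 10:
    n += 1
    total += n

# Booleans: comparisons and membership tests evaluate to True or False.
has_rome = "Rome" in unique_cities
```

A `for` loop fits when the number of items is known up front, while the `while` loop runs until a condition changes.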
Sprint 3: Pandas Foundation
Build a strong foundation in Python and learn Pandas for data analysis.
Cover key topics including the fundamentals of Pandas and core data wrangling techniques, along with exploring datasets and summarizing data.
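A first look at a dataset with Pandas might go roughly like this. The DataFrame below is a made-up example, not one of the course datasets.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "product": ["apple", "banana", "cherry", "date"],
    "price": [1.20, 0.50, 3.00, 2.30],
    "in_stock": [True, True, False, True],
})

# Exploring: first rows, dimensions, and column types.
print(df.head(2))
print(df.shape)
print(df.dtypes)

# Summarizing: descriptive statistics for the numeric columns.
print(df.describe())

# Selecting a column and filtering rows with a boolean mask.
mean_price = df["price"].mean()
available = df[df["in_stock"]]
```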
Sprint 4: Data Wrangling with Pandas
Focus on cleaning and transforming messy datasets to make them analysis-ready.
Learn to merge and concatenate DataFrames, perform data assessment and cleaning, and apply aggregation techniques.
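The wrangling steps named above (concatenating, assessing, cleaning, merging, aggregating) can be sketched in a few lines. The tables and column names are hypothetical, and filling the missing amount with 0 is only an illustrative choice; the right handling depends on the dataset.

```python
import pandas as pd

# Hypothetical monthly order tables and a customer lookup table.
orders_jan = pd.DataFrame({"customer_id": [1, 2], "amount": [10.0, None]})
orders_feb = pd.DataFrame({"customer_id": [1, 3], "amount": [5.0, 20.0]})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "country": ["DE", "FR", "DE"],
})

# Concatenate: stack tables that share the same columns.
orders = pd.concat([orders_jan, orders_feb], ignore_index=True)

# Assess: count missing values before changing anything.
n_missing = int(orders["amount"].isna().sum())

# Clean: fill the missing amount (illustrative assumption, see note above).
orders["amount"] = orders["amount"].fillna(0.0)

# Merge: add customer attributes through the shared key column.
merged = orders.merge(customers, on="customer_id", how="left")

# Aggregate: total order amount per country.
totals = merged.groupby("country")["amount"].sum()
print(totals)
```

`how="left"` keeps every order row even if a customer were missing from the lookup table, which makes unmatched keys visible as missing values instead of silently dropping them.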
| Notebook | Type | Topic |
| --- | --- | --- |
| Notebook 37 | Lecture | Concatenating & Merging DataFrames |
| Notebook 38 | Lecture | Assessing & Cleaning Data |
| Notebook 39 | Lecture | Assessing, Cleaning & Grouping Data from a DataFrame (skipped)¹ |
| Notebook 40 | Lecture | Defining Functions to Clean Data |
| Notebook 41 | Exercises | Data Integration |
| Notebook 42 | Exercises | Data Assessment |
| Notebook 43 | Exercises | Data Cleaning |
| Notebook 44 | Exercises | Aggregating Information & Applying |
¹ This lecture was a revision of the previous day's concepts to solidify knowledge; see Notebook 38 for notes.
Sprint 5: Exploratory Data Analysis (EDA) with Pandas
Explore essential tools and techniques for effective data exploration.
Understand and practice univariate, bivariate, and multivariate analysis along with other EDA methods.
Apply your knowledge in a hands-on project.
Gain practical experience in cleaning, preparing, exploring, visualizing, and summarizing data.
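To make the three levels of analysis concrete, here is a minimal sketch on a small made-up vehicle table. The column names and values are hypothetical and are not the project dataset.

```python
import pandas as pd

# Hypothetical vehicle data, for illustration only.
cars = pd.DataFrame({
    "fuel": ["gas", "gas", "diesel", "diesel"],
    "horsepower": [100, 150, 110, 160],
    "price": [10000, 15000, 12000, 17000],
})

# Univariate: look at one variable at a time.
fuel_counts = cars["fuel"].value_counts()
price_summary = cars["price"].describe()

# Bivariate: relationship between two variables (Pearson correlation).
hp_price_corr = cars["horsepower"].corr(cars["price"])

# Multivariate: several numeric variables summarized per category.
by_fuel = cars.groupby("fuel")[["horsepower", "price"]].mean()

print(fuel_counts)
print(price_summary)
print(round(hp_price_corr, 2))
print(by_fuel)
```

In practice each step is usually paired with a plot (histogram, scatter plot, grouped bar chart); plotting is left out here to keep the sketch dependent on Pandas alone.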
| File | Description |
| --- | --- |
| Project Description | Project overview with tasks and deliverables |
| Raw Data | Original dataset provided for analysis |
| Metadata | Data dictionary with a description of the original dataset |
| Clean Data | Cleaned dataset after wrangling |
| Analysis | Full exploratory data analysis (EDA) and key insights |
This project applied the full data analysis workflow to uncover insights about vehicle pricing, efficiency, and performance.